Understanding the difference
between “IS NULL” and “= NULL”
When a variable is created in SQL with the declare statement
it is created with no data and stored in the variable table (vtable) inside SQLs
memory space. The vtable contains the name and memory address of the variable.
However, when the variable is created no memory address is allocated to the
variable and thus the variable is not defined in terms of memory.
When you SET the variable it is allotted a memory address
and the initial data is stored in that address. When you SET the value again
the data in the memory address pointed to by the variable is then changed to
the new value.
Now for the difference and why each behaves the way it does.
“= NULL”
“= NULL” is an expression of value. Meaning, if the variable
has been set and memory created for the storage of data it has a value. A
variable can in fact be set to NULL which means the data value of the objects
is unknown. If the value has been set like so:
DECLARE @val CHAR(4)
SET @val = NULL
You have explicitly set the value of the data to unknown and
so when you do:
If @val = NULL
It will evaluate as a true expression.
But if I do:
DECLARE @val CHAR(4)
If @val = NULL
It will evaluate to false.
The reason for this is the fact that I am checking for NULL
as the value of @val. Since I have not SET the value of @val no memory address
has been assigned and therefore no value exists for @val.
Note: See section on SET ANSI_NULLS (ON|OFF) due to
differences in SQL 7 and 2000 defaults that cause examples to not work. This is
based on SQL 7.
“IS NULL”
Now “IS NULL” is a little trickier and is the preferred
method for evaluating the condition of a variable being NULL. When you use the
“IS NULL” clause, it checks both the address of the variable and the data
within the variable as being unknown. So if I for example do:
DECLARE @val CHAR(4)
If @val IS NULL
PRINT ‘TRUE’
ELSE
PRINT ‘FALSE’
SET @val = NULL
If @val IS NULL
PRINT ‘TRUE’
ELSE
PRINT ‘FALSE’
Both outputs will be TRUE. The reason is in the first @val
IS NULL I have only declared the variable and no address space for data has
been set which “IS NULL” check for. And in the second the value has been
explicitly set to NULL which “IS NULL” checks also.
SET ANSI_NULLS (ON|OFF)
Now let me throw a kink in the works. In the previous
examples you see that = NULL will work as long as the value is explicitly set.
However, when you SET ANSI_NULLS ON things will behave a little different.
Ex.
DECLARE @val CHAR(4)
SET @val = NULL
SET ANSI_NULLS ON
If @val =NULL
PRINT ‘TRUE’
ELSE
PRINT ‘FALSE’
SET ANSI_NULLS OFF
If @val =NULL
PRINT ‘TRUE’
ELSE
PRINT ‘FALSE’
You will note the first time you run the = NULL statement
after doing SET ANSI_NULLS ON you get a FALSE and after setting OFF you get a
TRUE. The reason is as follows.
Excerpt from SQL BOL article “SET ANSI_NULLS”
The SQL-92 standard requires that an equals (=) or not equal to (<>)
comparison against a null value evaluates to FALSE. When SET ANSI_NULLS is ON,
a SELECT statement using WHERE column_name = NULL returns zero rows even
if there are null values in column_name. A SELECT statement using WHERE column_name
<> NULL returns zero rows even if there are nonnull values in column_name.
When SET ANSI_NULLS is OFF, the Equals (=) and Not Equal To (<>)
comparison operators do not follow the SQL-92 standard. A SELECT statement
using WHERE column_name = NULL returns the rows with null values in column_name.
A SELECT statement using WHERE column_name <> NULL returns the
rows with nonnull values in the column. In addition, a SELECT statement using
WHERE column_name <> XYZ_value returns all rows that are not XYZ value
and that are not NULL.
End Excerpt
So as defined by SQL92, “= NULL” should always evaluate
false. So even setting the value explicitly means you will never meet the =
NULL if condition and your code may not work as intended. The biggest reason
where = NULL will shoot you in the foot is this, SQL 7 when shipped and
installed is defaulted to ANSI_NULL OFF but SQL 2000 is defaulted to ANSI_NULL
ON. Of course you can alter this several ways but if you upgraded a database
from 7 to 2000 and found the = NULL worked only when you set if explicitly when
you roll out a default 2000 server your code now breaks and can cause data
issues.
Yet another reason to use IS NULL instead as under SQL 92
guidelines it is still going to evaluate to TRUE and thus your code is safer
for upgrading the server.
Summary
If summary unless you need to check that the value of a
variable was set to equal NULL and you have set ANSI_NULLS ON, then always use
the “IS NULL” clause to validate if a variable is NULL. By using = NULL
instead you can cause yourself a lot of headaches in trying to troubleshoot
issues that may arise from it, now or unexpectedly in the future.
Basis
Some of the information provided comes from how C++ works
and how SQL behaves under each circumstance. Unfortunately, SQL as far as I
know does not have an addressof function to allow me to output the actual
memory address to show what occurs under the hood. In C++ when a variable is
created the variable has an address of 0xddddddd (in debug but it can be
different non-real addresses as well). When you set the variable the first time
checking the address will give you a valid memory address where the data is
being stored. Also, more information can be obtained from SQL Books Online in
the sections on IS NULL and SET ANSI_NULLS….