SQLServerCentral Article

The Semantics of NULL in SQL Server 2008

Null values in SQL Server have long been a cause of grief for many query designers, as they realize along the way that NULL values require special handling. The main cause of confusion, I would say, is thinking that NULL means blank or empty. In this article we summarize the semantics of NULL and provide some examples of its use and its consequences for query designers.

The nullability of a column determines whether the rows in the table can contain a null value for that column. For example, in the following table-creation script, column FirstName does not allow NULLs, but LastName does:

 CREATE TABLE [dbo].[Person](
      [FirstName] [varchar](50) NOT NULL,
      [LastName] [varchar](50) NULL
 ) ON [PRIMARY]

A NULL value in SQL Server is (by design) meant to indicate an unknown or undefined value, and as such, is different from any other value, including another NULL. Let's look at some examples to see what this all means.

Any comparisons involving NULLs will have a NULL result, like these ones:

     NULL <> NULL
      NULL > 5

Also, any non-logical expressions involving NULLs have an unkown, or NULL, result. For example:

     NULL + 4

returns a result of NULL.

AND and OR tables with NULL/UNKNOWN values

This next table summarizes the effect of NULLs in AND expressions:

AND True False NULL
True True False NULL
False False False False
NULL NULL False NULL

For example, the result of the expression

NULL and True

is NULL. But note that

NULL and False

is False, since for an entire AND expression to result in True all of its operands must be true, and in this case we are certain that at least of them is False.

This second table summarizes the effect of NULLs in OR expressions:

OR True False NULL
True True True True
False True False NULL
NULL True NULL NULL

For example, the result of the expression

                NULL or False

is NULL. But note that

     NULL or True

is True, since for an entire OR expression to result in True at least one of its operands must be true, and in this case we are certain that at least of them is True.

How to account for NULL values in expressions

If you write expressions using columns that may have NULL values, you must explicitly program for the possibility of NULLs. The way to do this is with the IS NULL and IS NOT NULL operators and with the ISNULL and COLASCE functions. For example, if you want to explicitly test whether some value is NULL, the correct way of doing it is

     MyValue IS NULL

The incorrect way would be

     MyValue = NULL

because, as we mentioned above, any logical expression involving NULLs results in NULL, which is not True. So, if MyValue were in fact NULL, this second expression would not be true!

The ISNULL function takes two arguments. It returns the value of the first one if it is not NULL, and the value of the second one if the first one is NULL. For example,

     ISNULL('jane', 'john')

returns 'jane', and

     ISNULL(NULL, 'john')

returns 'john'.

Finally, the COALESCE function takes one or more parameters and returns the first one that is not null. For instance:

                COALESCE('john', 'jane', NULL)

evaluates to 'john', and

                COALESCE(NULL, 'john', 'jane')

also evaluates to 'john'. To see the relationship between ISNULL and COALESCE, note that these two expressions are equivalent:

     COALESCE(@MyVariable, 0)
      ISNULL(@MyVariable, 0)

because if @MyVariable is NULL, both will evaluate to 0, and if it's not, both will evaluate to the value of @MyVariable.

A more concrete example

Let's do a concrete example, to see the effects of all of these rules. Consider this table:

 CREATE TABLE [dbo].[tbl](
      [number] [int] NULL,
      [string] [varchar](50) NULL,
      [boolean] [bit] NULL
 ) ON [PRIMARY]

Note that all of the columns allow NULLs. Now we this data in the table:

number string boolean
5 jane 1
3 NULL
NULL NULL NULL

Let's say we wanted to know all of the rows in which the number column is null. Consider this query to accomplish this goal:

 select * from tbl
 where number = null

Let's think: any logical expression involving NULL results in NULL, which is not True. Thus, the WHERE clause fails to be True for every row, and our result is 0 rows. A correct query would be:

 select * from tbl
 where number is null

which results in:

number string boolean
NULL NULL NULL

as expected.

As a second example, let's say we wanted to know all of the people whose name is not 'jane'. One would think to do it like this:

      select * from tbl
      wherestring <> 'jane'

but it returns

number string boolean
3 NULL

This result is missing the row where string is NULL, because the expression

     NULL <> 'jane'

evaluates to NULL . A correct query would be:

      select * from tbl
      where (string is NULL) or(string <> 'jane')

but which could be shortened using the ISNULL function:

      select * from tbl
      where ISNULL(string,'') <>'jane'

Now one last example, using the boolean column. If we wanted to know all of the rows with a True value in this column, this would be correct:

      select * from tbl
      whereboolean = 1

but don’t be tricked into thinking that this query gives you all of the non-true rows:

      select * from tbl
      whereboolean = 0

Since the column allows NULLs, this would be a way of accomplishing this task:

      select * from tbl
      where isnull(boolean,1)= 0

In summary: if a column allows NULL values, your queries need to account for them.

Keys and indexing

When defining table schemas and deciding whether or not to have columns that allow NULL values, you need to take into consideration the keys and indexes you will need. This list summarizes how NULL values impact key and index definitions in SQL Server 2008:

  • Nulls cannot be used in PK columns.
  • Nulls can be used in FK columns. (We have quite often encountered designs where we must allow NULLs in FK columns to indicate an unknown or not-yet-established relationship.)
  • Nulls can be used in unique indexes and they count as a unique value. That is, one row, but not two, may have NULL as its value for the indexed column.
  • Nulls can be used in non-unique indexes.

Design considerations

One way to avoid worrying about NULLs is never to use them, always declaring columns as not allowing NULLs and designating default values for "empty" or "unknown". This will save you keystrokes, especially when you want to check whether a column does not have a certain value. However, you will anyway probably have to write logic for when columns have the designated default values.

One last note

The handling of NULLs as we have summarized follows the ANSI standard. However, Transact-SQL offers an extension for null processing: If the option ANSI_NULLS is set to OFF, comparisons between nulls, such as NULL = NULL, evaluate to TRUE.

References

NULL Values (http://msdn.microsoft.com/en-us/library/ms191504.aspx), SQL Server 2008 Online Help.

About the author

Adolfo J. Socorro, Ph.D., leads eSolutions, a firm specialized in developing custom-made software.

Rate

3.99 (79)

You rated this post out of 5. Change rating

Share

Share

Rate

3.99 (79)

You rated this post out of 5. Change rating