Is there a reason for the ISO behaviour of ANSI_NULL?

Question

Is there a reason for the ISO behaviour of ANSI_NULL?

codykonior

SSCommitted

Points: 1591
More actions
July 6, 2015 at 10:27 pm

#319636

I overheard someone helping a user fix their query where they had used = NULL instead of IS NULL.
Of course we know that, but I'm curious why the standard didn't alias them together so that it could be used that way. It seems it would be more intuitive and would be interested if I could explain why that use case scenario for the decision was.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

JManuelN SSCommitted Points: 1708 More actions · Answer 1

Hi,

Casually I heard just the opposite yesterday. SQL Server query engine uses additional resources to detect the use of NULLs while ANSI_NULL is set to OFF

Here is a good example

http://sqltouch.blogspot.ie/2013/05/ansinulls-joinwhere-and-query-optimizer.html

Indeed the option of using NULLs like other value will be discontinued in future versions

SET ANSI_NULL

https://msdn.microsoft.com/en-us/library/ms188048.aspx

Luis Cazares SSC Guru Points: 183698 More actions · Answer 2

The best reason that I've encountered is consistency. We can't treat NULL values the same way as other values when comparing them. What would you expect when you use SomeValue < NULL or SomeValue > NULL?

Basically, when ANSI_NULLS is OFF, the equal operator (=) won't behave the same way as any other comparison operator.

This is a simplistic reasoning to the whole subject of NULLs. I hope that this makes sense for you.

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

David Moutray SSCertifiable Points: 5956 More actions · Answer 3

Yes, there is a reason for the default behavior of ANSI_NULL, and it goes back to the core concept of relational databases.

Relational databases are based on a mathematical model (first articulated by mathematician Ted Codd back in the 1970's). That mathematical model uses "Three-Valued Logic". The logic you learn in high school, Boolean Logic, is two-valued. It only has the values TRUE and FALSE. Three-Valued Logic has, you guessed it, three values: TRUE, FALSE and UNKNOWN.

This is how our basic comparisons work in Boolean Logic. Consider two logical statements, Alpha and Beta.

The logical statement, Alpha AND Beta is TRUE if both Alpha and Beta are TRUE. If either is FALSE then the full statement is also FALSE.

Conversely, Alpha OR Beta is TRUE if either Alpha or Beta is TRUE. It is only FALSE if both Alpha and Beta are FALSE.

It isn't that much more complex with three-valued logic, and it is actually quite intuitive when you think about it for a minute:

Alpha AND Beta is TRUE if both Alpha and Beta are TRUE, and it is FALSE if either Alpha or Beta is FALSE. If either Alpha or Beta is UNKNOWN, then the statement, Alpha AND Beta is also UNKNOWN. This does make sense when you think about it for a moment. If Alpha, for instance, is UNKNOWN, that means that we don't know what it is. It could be TRUE or it could be FALSE. We just don't know. If we don't know the value of one of the terms, Alpha or Beta, then it makes sense that we don't know the value of the composite statement, Alpha AND Beta.

The possible results for Alpha OR Beta is more complex with three-valued logic than it is with two-valued, Boolean logic, but it also "makes sense" when you stop to think about it for a moment. Here is a matrix of possible results:

Alpha --- Beta --- Alpha OR Beta

TRUE --- TRUE --- TRUE

TRUE --- FALSE--- TRUE

FALSE --- TRUE --- TRUE

FALSE --- FALSE--- FALSE

So far, this is the same as Boolean Logic, now lets throw in UNKNOWN

TRUE --- ???? --- TRUE

???? --- TRUE --- TRUE

This makes sense: If either Alpha or Beta is TRUE, we don't need to know the value of the other, the statement Alpha OR Beta is TRUE

FALSE --- ???? --- UNKNOWN

???? --- FALSE--- UNKNOWN

This also makes sense, if one of Alpha or Beta is FALSE, and we don't know the value of the other, then we can't determine the value of the statement Alpha OR Beta. If the other value is FALSE, then Alpha OR Beta is also FALSE, but if the other value is TRUE, then Alpha OR Beta is TRUE. We don't know, so the value of Alpha OR Beta has to be UNKNOWN.

In SQL Server, NULL takes on the behavior of UNKNOWN in Three-Valued logic. This same reasoning applies to the comparison operators: =, >, < and <>.

Does Alpha = Beta? If one of them is NULL, then we can't be sure. They might be equal. One of the values is UNKNOWN. So, the value of the logical statement, Alpha = Beta is also UNKNOWN, that is - NULL.

I hope this helps. I see a lot of confusion around the behavior of NULL, but it makes perfect sense if you realize that SQL is based on three-valued logic.

🙂

David Moutray SSCertifiable Points: 5956 More actions · Answer 4

I should also add that this explains why the logical statement NULL = NULL also yields NULL. We don't know either value in this statement. It is possible they are equal, even if it isn't very likely. (Three-valued logic does not take probability into account. :-))

Cody Konior SSCarpal Tunnel Points: 4827 More actions · Answer 5

I understood the logic but not why = wasn't traditionally aliased to Is.

However from what both of you have said it seems a) there is a performance benefit and b) they didn't want people comparing nulls with > and < and c) they didn't like saying two nulls equal each other because they're stand in for unknowns and could be different.

So I got it.