Is there a reason for the ISO behaviour of ANSI_NULL?

  • I overheard someone helping a user fix their query where they had used = NULL instead of IS NULL.

    Of course we know that, but I'm curious why the standard didn't alias them together so that it could be used that way. It seems it would be more intuitive and would be interested if I could explain why that use case scenario for the decision was.

  • Hi,

    Casually I heard just the opposite yesterday. SQL Server query engine uses additional resources to detect the use of NULLs while ANSI_NULL is set to OFF

    Here is a good example

    http://sqltouch.blogspot.ie/2013/05/ansinulls-joinwhere-and-query-optimizer.html

    Indeed the option of using NULLs like other value will be discontinued in future versions

    SET ANSI_NULL

    https://msdn.microsoft.com/en-us/library/ms188048.aspx

  • The best reason that I've encountered is consistency. We can't treat NULL values the same way as other values when comparing them. What would you expect when you use SomeValue < NULL or SomeValue > NULL?

    Basically, when ANSI_NULLS is OFF, the equal operator (=) won't behave the same way as any other comparison operator.

    This is a simplistic reasoning to the whole subject of NULLs. I hope that this makes sense for you.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Yes, there is a reason for the default behavior of ANSI_NULL, and it goes back to the core concept of relational databases.

    Relational databases are based on a mathematical model (first articulated by mathematician Ted Codd back in the 1970's). That mathematical model uses "Three-Valued Logic". The logic you learn in high school, Boolean Logic, is two-valued. It only has the values TRUE and FALSE. Three-Valued Logic has, you guessed it, three values: TRUE, FALSE and UNKNOWN.

    This is how our basic comparisons work in Boolean Logic. Consider two logical statements, Alpha and Beta.

    The logical statement, Alpha AND Beta is TRUE if both Alpha and Beta are TRUE. If either is FALSE then the full statement is also FALSE.

    Conversely, Alpha OR Beta is TRUE if either Alpha or Beta is TRUE. It is only FALSE if both Alpha and Beta are FALSE.

    It isn't that much more complex with three-valued logic, and it is actually quite intuitive when you think about it for a minute:

    Alpha AND Beta is TRUE if both Alpha and Beta are TRUE, and it is FALSE if either Alpha or Beta is FALSE. If either Alpha or Beta is UNKNOWN, then the statement, Alpha AND Beta is also UNKNOWN. This does make sense when you think about it for a moment. If Alpha, for instance, is UNKNOWN, that means that we don't know what it is. It could be TRUE or it could be FALSE. We just don't know. If we don't know the value of one of the terms, Alpha or Beta, then it makes sense that we don't know the value of the composite statement, Alpha AND Beta.

    The possible results for Alpha OR Beta is more complex with three-valued logic than it is with two-valued, Boolean logic, but it also "makes sense" when you stop to think about it for a moment. Here is a matrix of possible results:

    Alpha --- Beta --- Alpha OR Beta

    TRUE --- TRUE --- TRUE

    TRUE --- FALSE--- TRUE

    FALSE --- TRUE --- TRUE

    FALSE --- FALSE--- FALSE

    So far, this is the same as Boolean Logic, now lets throw in UNKNOWN

    TRUE --- ???? --- TRUE

    ???? --- TRUE --- TRUE

    This makes sense: If either Alpha or Beta is TRUE, we don't need to know the value of the other, the statement Alpha OR Beta is TRUE

    FALSE --- ???? --- UNKNOWN

    ???? --- FALSE--- UNKNOWN

    This also makes sense, if one of Alpha or Beta is FALSE, and we don't know the value of the other, then we can't determine the value of the statement Alpha OR Beta. If the other value is FALSE, then Alpha OR Beta is also FALSE, but if the other value is TRUE, then Alpha OR Beta is TRUE. We don't know, so the value of Alpha OR Beta has to be UNKNOWN.

    In SQL Server, NULL takes on the behavior of UNKNOWN in Three-Valued logic. This same reasoning applies to the comparison operators: =, >, < and <>.

    Does Alpha = Beta? If one of them is NULL, then we can't be sure. They might be equal. One of the values is UNKNOWN. So, the value of the logical statement, Alpha = Beta is also UNKNOWN, that is - NULL.

    I hope this helps. I see a lot of confusion around the behavior of NULL, but it makes perfect sense if you realize that SQL is based on three-valued logic.

    🙂

  • I should also add that this explains why the logical statement NULL = NULL also yields NULL. We don't know either value in this statement. It is possible they are equal, even if it isn't very likely. (Three-valued logic does not take probability into account. :-))

  • I understood the logic but not why = wasn't traditionally aliased to Is.

    However from what both of you have said it seems a) there is a performance benefit and b) they didn't want people comparing nulls with > and < and c) they didn't like saying two nulls equal each other because they're stand in for unknowns and could be different.

    So I got it.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply