PHYData DBA (12/26/2014)
There are places where the M$ TSQL does not follow the ANSI SQL standard.
MHO: The IN operator and using the NOT qualifier with it are "NOT IN" that set. Not even if you add to how they handle NULLS and 3VL.
That is correct. ANSI standard defined IN as a series of OR'ed comparisons, so WHERE x IN (1, NULL) has to have the exact same effect as (x=1 OR x=NULL). For x equal to 1 this equates to True OR Unknown, which yields True; the row is returned. For x is NULL this equates to Unknown OR Unknown (because any comparison with a NULL yields Unknown, even if both operands are NULL), is Unknown. And for x = 2, this is False OR Unknown, also Unknown. Those two rows are not returned.
Also, ANSI defined x NOT IN (...) as equal to NOT (x IN (...)), so WHERE x NOT IN (1, NULL) must behave identical as WHERE NOT (x IN (1, NULL)). The inner part of that expression will, as explained above, return True for x = 1 and Unknown for x is null or any value other than 1. NOT (True) is False and NOT (Unknown) is Unknown. The NOT IN expression returns False for x = 1 (so the row is not returned). For x equal to any other value or NULL, the expression returns Unknown, so those rows, too, are skipped.
Every database product that behaves different is in violation of the ANSI standard. I have never used other products, but I would be very surprised if any of the big competitors would have a non-standard behaviour by default in this area.
To my chagrin, SQL Server does have an option (SET ANSI_NULLS OFF) that forces SQL Server to use different logic for comparing with NULL. This setting has been deprecated a long time ago, and will be removed in a future version - much to my delight, because I don't like non-standard behaviour.