Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Understanding NULL


Understanding NULL

Author
Message
muzahm
muzahm
SSC-Enthusiastic
SSC-Enthusiastic (153 reputation)SSC-Enthusiastic (153 reputation)SSC-Enthusiastic (153 reputation)SSC-Enthusiastic (153 reputation)SSC-Enthusiastic (153 reputation)SSC-Enthusiastic (153 reputation)SSC-Enthusiastic (153 reputation)SSC-Enthusiastic (153 reputation)

Group: General Forum Members
Points: 153 Visits: 47
Comments posted to this topic are about the item Understanding NULL
Tao Klerks
Tao Klerks
Ten Centuries
Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)

Group: General Forum Members
Points: 1393 Visits: 1249
A slightly easier way to explain might be the fully-written-out equivalents of the IN clauses.

I've taken the liberty of "paraphrasing" T-SQL Syntax a little here, as there is no way of representing boolean values (the results of boolean operations) in T-SQL (or any SQL?) that I know of.


IF(NULL IN (34, 35, NULL))PRINT 'TRUE' =>
IF(NULL = 34 OR NULL = 35 OR NULL = NULL) PRINT 'TRUE' =>
IF(UNKNOWN OR UNKNOWN OR UNKNOWN) PRINT 'TRUE' =>
IF(UNKNOWN) PRINT 'TRUE'


and


IF(34 NOT IN (35, 36, NULL))PRINT 'FALSE' =>
IF(NOT (34 = 35 OR 34 = 36 OR 34 = NULL)) PRINT 'FALSE' =>
IF(NOT (FALSE OR FALSE OR UNKNOWN)) PRINT 'FALSE' =>
IF(NOT (UNKNOWN)) PRINT 'FALSE'



Because we all know(? Smile ) that "False OR UNKNOWN" evaluates to UNKNOWN, whereas "True OR UNKNOWN" evaluates to True...

There's some great articles on SQLServerCentral.com about Nulls, eg: http://www.sqlservercentral.com/articles/Advanced+Querying/2829/


Now if you REALLY want to have some fun, turn off the ANSI_NULLS option:
SET ANSI_NULLS OFF



All of a sudden three-valued logic becomes two-valued, and you have the equivalent of:


IF(NULL IN (34, 35, NULL))PRINT 'TRUE' =>
IF(NULL = 34 OR NULL = 35 OR NULL = NULL) PRINT 'TRUE' =>
IF(FALSE OR FALSE OR TRUE) PRINT 'TRUE' =>
IF(TRUE) PRINT 'TRUE'


and


IF(34 NOT IN (35, 36, NULL))PRINT 'FALSE' =>
IF(NOT (34 = 35 OR 34 = 36 OR 34 = NULL)) PRINT 'FALSE' =>
IF(NOT (FALSE OR FALSE OR FALSE)) PRINT 'FALSE' =>
IF(NOT (FALSE)) PRINT 'FALSE'



And now everything prints as you might have expected if you came to SQL Server before 2000, which is when ANSI_NULLS became default...

(and incidentally, that's a small issue in the question, which assumed that ANSI_NULLS was off - a reasonable assumption in most environments, but still a rich source of horrible confusion when it is not)

http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.
archie flockhart
archie flockhart
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1253 Visits: 1150
Good question, and good comment to explain the result and the fact that it depends on ANSI_NULLS setting.

It got me thinking about the other IN / NOT IN syntax as well:

Suppose I want a list of all the employees in NORTHWIND database who have other people reporting to them. I coud do it like this:

select * from employees where employeeid  IN (select reportsto from employees) 



and I get 2 records: employee 2 and employee 5.


Suppose I want a list of all the employees in NORTHWIND database who DON'T have anyone reporting to them. If I try it using the apparently equivalent code ...

select * from employees where employeeid NOT IN (select reportsto from employees) 



... I get no records returned.

The reason is that there is one employee with a NULL entry in "reportsto", which makes the NOT IN comparison translate to (employeeid <>2 AND employeeid <>5 AND employeeid <> NULL)

And as mentioned above, TRUE AND TRUE AND UNKNOWN equals UNKNOWN.

To make it work I need to remove the NULLS:

select * from employees where employeeid NOT IN (select reportsto from employees where reportsto is not null) 





Now I just need to check my actual code to make sure I haven't fallen for this one in real life.
Tao Klerks
Tao Klerks
Ten Centuries
Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)

Group: General Forum Members
Points: 1393 Visits: 1249
Hmm, nice one, I've never seen that issue described.

That provides another reason why the following syntax is (in my opinion) preferable:


select *
from employees
left join employees as subordinates on employees.employeeid = subordinates.reportsto
where subordinates.employeeid Is Null



The main other reason I know of is performance on complex queries, where the query optimizer may sometimes evaluate the subquery (potentially passing a lot of data to the outer query) instead of doing an efficient join (I don't know the exact circumstances governing this, but I have definitely seen it happen)

Unfortunately the above syntax/approach has the disadvantage that it is less intuitive to many people Sad

I make it a rule that subqueries (but not derived tables) are to be avoided at all costs; does anyone have a more flexible rule of thumb that can allow the use of subqueries without risk of perfromance or logic issues?

http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.
Marius Els
Marius Els
SSC Veteran
SSC Veteran (218 reputation)SSC Veteran (218 reputation)SSC Veteran (218 reputation)SSC Veteran (218 reputation)SSC Veteran (218 reputation)SSC Veteran (218 reputation)SSC Veteran (218 reputation)SSC Veteran (218 reputation)

Group: General Forum Members
Points: 218 Visits: 179
This question got me!

Well done muzahm

The set ansi-nulls off could off course be debated, but still, good question.

The first if I had no problem with, but the second one got me...

I learned a lot about nulls through Tao's explanation and examples. Thanks, clear, precise and to the point.

The matter on nulls can so easily lead to queries not returning expected results and if you are unaware of the implications don't even realize your query is not returning what you expect and the "bug" is left in the code.....
Sean Lange
Sean Lange
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16525 Visits: 16992
Tao Klerks (11/18/2008)
Hmm, nice one, I've never seen that issue described.

That provides another reason why the following syntax is (in my opinion) preferable:


select *
from employees
left join employees as subordinates on employees.employeeid = subordinates.reportsto
where subordinates.employeeid Is Null



The main other reason I know of is performance on complex queries, where the query optimizer may sometimes evaluate the subquery (potentially passing a lot of data to the outer query) instead of doing an efficient join (I don't know the exact circumstances governing this, but I have definitely seen it happen)

Unfortunately the above syntax/approach has the disadvantage that it is less intuitive to many people Sad

I make it a rule that subqueries (but not derived tables) are to be avoided at all costs; does anyone have a more flexible rule of thumb that can allow the use of subqueries without risk of perfromance or logic issues?


This example is really more of a find unmatched query. It would return an employee record if the employeeid was deleted but another employee record still had that ID as the reportsto in addition to those that are truly unassigned. Let's say we had EmployeeID 10 and an employee with ReportsTo as 10 (let's say ID 45), then we delete Employees where employeeid = 10. In the above query we would see employeeid 45 because the join would fail and leave the subordinates derived table as null. Smile

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Tao Klerks
Tao Klerks
Ten Centuries
Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)Ten Centuries (1.4K reputation)

Group: General Forum Members
Points: 1393 Visits: 1249
Hi slange,

I think you're confusing a couple of possibilities.

The query above reports employees who have no "subordinates" - there are no employees that have their employeeid in the reportsto field.

The query that you are thinking of is almost exactly the opposite:


select *
from employees
left join employees as supervisors on employees.reportsto = supervisors.employeeid
where supervisors.employeeid Is Null
   and employees.reportsto Is Not Null



It's actually slightly different because it also has an extra WHERE criterion - making sure that the employees returned actually have a reportsto value (for a supervisor that no longer exists, or never existed).

Does this help?

http://poorsql.com for T-SQL formatting: free as in speech, free as in beer, free to run in SSMS or on your version control server - free however you want it.
Sean Lange
Sean Lange
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16525 Visits: 16992
Tao Klerks (11/18/2008)
Hi slange,

I think you're confusing a couple of possibilities.

The query above reports employees who have no "subordinates" - there are no employees that have their employeeid in the reportsto field.

The query that you are thinking of is almost exactly the opposite:


select *
from employees
left join employees as supervisors on employees.reportsto = supervisors.employeeid
where supervisors.employeeid Is Null
   and employees.reportsto Is Not Null



It's actually slightly different because it also has an extra WHERE criterion - making sure that the employees returned actually have a reportsto value (for a supervisor that no longer exists, or never existed).

Does this help?


Try this and you will see what I mean. Smile

create table #employees (employeeid int, reportsto int)

insert #employees select 10, null
insert #employees select 45, 10

--here is the original query
select *
from #employees
left join #employees as subordinates on #employees.employeeid = subordinates.reportsto
where subordinates.employeeid Is Null

--now delete the reportsto record
delete #employees where employeeid = 10

--they still show up
select *
from #employees
left join #employees as subordinates on #employees.employeeid = subordinates.reportsto
where subordinates.employeeid Is Null

drop table #employees

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
archie flockhart
archie flockhart
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1253 Visits: 1150
Yes, employee 45 SHOULD show up in this list, because nobody reports to them.

Add another record where someone reports to 45, then 45 will not appear - even if you delete its parent record:




create table #employees (employeeid int, reportsto int)

insert #employees select 10, null
insert #employees select 45, 10
insert #employees select 50, 45

--here is the original query - we find employee 50 as the only one who does not have anyone reporting to them
select *
from #employees
left join #employees as subordinates on #employees.employeeid = subordinates.reportsto
where subordinates.employeeid Is Null

--now delete the reportsto record
delete #employees where employeeid = 10

-- we still get the same answer, 45 still does not appear, even though its parent is deleted
select *
from #employees
left join #employees as subordinates on #employees.employeeid = subordinates.reportsto
where subordinates.employeeid Is Null

drop table #employees






Yelena Varshal
Yelena Varshal
Hall of Fame
Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)Hall of Fame (3.5K reputation)

Group: General Forum Members
Points: 3480 Visits: 593
As rightly mentioned by Tao, try to run it with ANSI_NULLS OFF:

set ansi_nulls off
IF(NULL IN (34, 35, NULL))PRINT 'TRUE'
IF(34 NOT IN (35, 36, NULL))PRINT 'FALSE'

and you will get a completely different answer


Regards,
Yelena Varshal

Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search