Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12345»»»

Fun with Outer Joins Expand / Collapse
Author
Message
Posted Sunday, September 9, 2012 5:47 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Thursday, August 27, 2015 12:57 PM
Points: 3,476, Visits: 1,892
Comments posted to this topic are about the item Fun with Outer Joins

Kenneth Fisher
I strive to live in a world where a chicken can cross the road without being questioned about its motives.
--------------------------------------------------------------------------------
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Link to my Blog Post --> www.SQLStudies.com
Post #1356525
Posted Monday, September 10, 2012 2:44 AM


SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Wednesday, August 13, 2014 1:12 AM
Points: 27, Visits: 117
Thanks for the article. In my experience (15 years SQL Server, 25 Oracle) there have been so many instances where despite hours of research and hacking the T-SQL, an outer join fails to deliver the keys for non matched rows. This appears to be a not uncommon finding. Maybe there's scope for a follow on article explaining with why this happens and how to fix it without resorting to #tables.

Men who wish to know about the world must learn about it in its particular details. - Heraclitus
Post #1356593
Posted Monday, September 10, 2012 2:52 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Friday, August 28, 2015 2:20 AM
Points: 202, Visits: 765
hi there,

maybe this little trick helps:

I personally understood this behaviour of outer joins quite a time ago when it was explained to me as follows (yes I know, very simplified...):
there is a logical order of processing the a query. At some time the outer join with its ON-clause is processed, first as an inner join, then adding the outer rows and also the NULL-values on the missing inner rows. Some time after that the WHERE-clause is processed maybe "removing" null-values.

Since I'm saying to myself "first the ON is processed, after that the WHERE is processed" it's no problem for me any more writing my outer joins in a way they are doing what I expect.
Post #1356597
Posted Monday, September 10, 2012 7:23 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Thursday, August 27, 2015 6:31 AM
Points: 2,737, Visits: 3,551
I suppose I don't understand the point. We know that when records do not exist in the RIGHT table that NULLs are returned. So why would we expect a filter condition on the RIGHT table to treat NULLs differently than we would otherwise?

i.e. Simply change the filter condition to account for NULLs

WHERE Professor.HasTenure = 'True'
AND (Class.ClassYear >= 2011 OR Class.ClassYear IS NULL)



Thanks,

Jared
SQL Know-It-All

How to post data/code on a forum to get the best help - Jeff Moden
Post #1356716
Posted Monday, September 10, 2012 7:40 AM


SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Wednesday, August 12, 2015 2:18 PM
Points: 887, Visits: 1,782
SQLKnowItAll (9/10/2012)
I suppose I don't understand the point. We know that when records do not exist in the RIGHT table that NULLs are returned. So why would we expect a filter condition on the RIGHT table to treat NULLs differently than we would otherwise?

i.e. Simply change the filter condition to account for NULLs

WHERE Professor.HasTenure = 'True'
AND (Class.ClassYear >= 2011 OR Class.ClassYear IS NULL)



id say because the above query is the classic catch all. if we can avoid the catch all performance should increase. (as always the it depends please test applies).

if we have an index on Class with ProfessorID and ClassYear we can use that index in the join. we also eliminating records we dont need early in execution and have a lower overall row count.


To Kenneth, Great article and a gave me a great way to explain join conditions vs where conditions.



For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden for the best way to ask your question.

For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw

Need to Split some strings? Jeff Moden's DelimitedSplit8K
Jeff Moden's Cross tab and Pivots Part 1
Jeff Moden's Cross tab and Pivots Part 2

Jeremy Oursler
Post #1356733
Posted Monday, September 10, 2012 7:48 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Thursday, August 27, 2015 6:31 AM
Points: 2,737, Visits: 3,551
capn.hector (9/10/2012)
SQLKnowItAll (9/10/2012)
I suppose I don't understand the point. We know that when records do not exist in the RIGHT table that NULLs are returned. So why would we expect a filter condition on the RIGHT table to treat NULLs differently than we would otherwise?

i.e. Simply change the filter condition to account for NULLs

WHERE Professor.HasTenure = 'True'
AND (Class.ClassYear >= 2011 OR Class.ClassYear IS NULL)



id say because the above query is the classic catch all. if we can avoid the catch all performance should increase. (as always the it depends please test applies).

if we have an index on Class with ProfessorID and ClassYear we can use that index in the join. we also eliminating records we dont need early in execution and have a lower overall row count.


To Kenneth, Great article and a gave me a great way to explain join conditions vs where conditions.
I don't see it as a catch all. I see it as defining the requirements of the query. The join condition is the id, the filter is the tenure and the year. Assessing the question posed, I wouldn't expect to see the results that the author posed; i.e. if class year is NULL, then they didn't teach that year and I don't need them in the results. I just don't understand why the expected results would be different. If someone has been working with LEFT OUTER JOINs for more than 20 minutes... why would they ever expect the query to return rows from the LEFT table when the filter is on the RIGHT? I don't put the filter in the JOIN because it is a filter, not a join condition.


Thanks,

Jared
SQL Know-It-All

How to post data/code on a forum to get the best help - Jeff Moden
Post #1356740
Posted Monday, September 10, 2012 8:07 AM


SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Wednesday, August 12, 2015 2:18 PM
Points: 887, Visits: 1,782
SQLKnowItAll (9/10/2012)
capn.hector (9/10/2012)
SQLKnowItAll (9/10/2012)
I suppose I don't understand the point. We know that when records do not exist in the RIGHT table that NULLs are returned. So why would we expect a filter condition on the RIGHT table to treat NULLs differently than we would otherwise?

i.e. Simply change the filter condition to account for NULLs

WHERE Professor.HasTenure = 'True'
AND (Class.ClassYear >= 2011 OR Class.ClassYear IS NULL)



id say because the above query is the classic catch all. if we can avoid the catch all performance should increase. (as always the it depends please test applies).

if we have an index on Class with ProfessorID and ClassYear we can use that index in the join. we also eliminating records we dont need early in execution and have a lower overall row count.


To Kenneth, Great article and a gave me a great way to explain join conditions vs where conditions.
I don't see it as a catch all. I see it as defining the requirements of the query. The join condition is the id, the filter is the tenure and the year. Assessing the question posed, I wouldn't expect to see the results that the author posed; i.e. if class year is NULL, then they didn't teach that year and I don't need them in the results. I just don't understand why the expected results would be different. If someone has been working with LEFT OUTER JOINs for more than 20 minutes... why would they ever expect the query to return rows from the LEFT table when the filter is on the RIGHT? I don't put the filter in the JOIN because it is a filter, not a join condition.


while i agree with you that the join condition is the id and the filter is tenure and year, i disagree that the join on id should be the only thing in the join conditions. as far as the expected results, well users want what they want and we are there to provide.

(Class.ClassYear >= 2011 OR Class.ClassYear IS NULL) while not exactly a catch all with a passed variable you are still using the form (something = constant or something IS NULL). if we can eliminate the second check for null (even by not putting a "filter" in the where clause but in the join) we should speed up the query.

this of course is a great situation for the it depends, test both ways, or what does your code reviewer say answer to which is correct.



For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden for the best way to ask your question.

For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw

Need to Split some strings? Jeff Moden's DelimitedSplit8K
Jeff Moden's Cross tab and Pivots Part 1
Jeff Moden's Cross tab and Pivots Part 2

Jeremy Oursler
Post #1356759
Posted Monday, September 10, 2012 9:20 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Thursday, August 27, 2015 12:57 PM
Points: 3,476, Visits: 1,892
First I would like to appologize if my choice of test data is throwing people off. The data and table layout was meant to be simple and easy to follow so it didn't get in the way of the description of the difference between the ON clause and the WHERE clause.

Second I'll address a few points.

The correct terms are "preserved table " and "unpreserved table"

Never heard of either. What are they the correct terms for?


and your design is wrong.

My design is correct for the purpose. It's simple. I didn't need or want a complicated design to demonstrate a subject that can be complicated enough for a lot of people.

we do not use BIT flags in SQL

"We" do use BITs quite often. They are a nice small data type where I can fit 8 of them into a byte. Unless you have a good reason not to use them that I've never heard of I think I'll continue to not waste disk space.

Your life will be easier if you learn to use row constructors:

INSET INTO Professors
VALUES (1, 'Dr Coke', '2011-01-01'),
(2, 'Dr Sprite', '2011-01-01'),
(3, 'Dr 7-up', '2011-01-01'),
(4, 'Mr Pepper', NULL),
(5, 'Mr Fanta', NULL);

I do in fact know how to use this method of inserting. However it only works on 2008 and above and not everyone (believe it or not) is done using 2005 (and even lower).

Now try your queries with a proper schema.

My queries work exactly the same. Which is why I chose to use a very simple table layout. If I had chosen to make a realistic schema I could have added quite a few more columns, tables, defaults, constraints, foreign keys etc. That not being the point of the article I chose not to.

SQLKnowItAll (9/10/2012)
I suppose I don't understand the point. We know that when records do not exist in the RIGHT table that NULLs are returned. So why would we expect a filter condition on the RIGHT table to treat NULLs differently than we would otherwise?

i.e. Simply change the filter condition to account for NULLs

WHERE Professor.HasTenure = 'True'
AND (Class.ClassYear >= 2011 OR Class.ClassYear IS NULL)


You can absolutly do it this way. Personally I prefer the other, and I do consider the ClassYear part of the join in this case. But that comes down to personal preference I think. I will say that you should consider yourself lucky to understand it so well. I still get around 1 or 2 questions a week on this subject. Which is why I wrote the article in the first place. It tends to be a difficult subject for some people. Not everyone, but in my experience the vast majority.


Kenneth Fisher
I strive to live in a world where a chicken can cross the road without being questioned about its motives.
--------------------------------------------------------------------------------
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Link to my Blog Post --> www.SQLStudies.com
Post #1356820
Posted Monday, September 10, 2012 10:04 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Friday, August 28, 2015 9:13 AM
Points: 96, Visits: 846
I really learned from this article! I've been working with SQL Server for 7 years now and I had no idea the results were different if I took a filter from the WHERE clause and put it into the ON clause...I always assumed the results were going to be the same. I personally knew the right table results were being filtered with the WHERE clause (Dr Sprite and Dr 7-Up were being excluded) because I've been down that bumpy road many times. I've always dealt with this situation by adding "ISNULL" into the where clause.

AND ISNULL(class.classyear,2011)>=2011

Not a pretty solution but NOW I know there is a better way!

Thanks for the article AND it's simplicity. I get what Joe Celko was driving at, but I find simple examples to be refreshing...though it could probably be argued that is why I'm a 7 year veteran and still learning what many DBA's consider elementary.
Post #1356857
Posted Monday, September 10, 2012 11:00 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Tuesday, April 21, 2015 7:00 AM
Points: 441, Visits: 1,801
The way I always remember this is thus: If you refer to the right hand table anywhere is the where clause, you immediately turn that left join into a inner join. That's the effect you see anyway.

Putting the predicate in the ON clause keeps it a left join.

Unless of course you are doing an anti-join, where you want all records from Set A not in Set B, so you use
select cols from Set_A a
left join Set_B b
On a.col1 = b.col1
where b.col1 is NULL

That is the only time I refer to a column from the right hand table involved in a left outer join in the where clause.

HTH

Dave J



http://glossopian.co.uk/
"I don't know what I don't know."
Post #1356893
« Prev Topic | Next Topic »

Add to briefcase 12345»»»

Permissions Expand / Collapse