Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 123»»»

Outer Join Trouble Expand / Collapse
Author
Message
Posted Monday, January 02, 2006 5:43 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: Administrators
Last Login: Today @ 11:24 AM
Points: 32,781, Visits: 14,942
Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/sjones/outerjointrouble.asp






Follow me on Twitter: @way0utwest

Forum Etiquette: How to post data/code on a forum to get the best help
Post #247738
Posted Thursday, January 05, 2006 1:30 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 9:54 AM
Points: 2,866, Visits: 1,707
The behaviour of the outer join is very useful if you get an question such as

"Give me all the customers who ordered in December who didn't order in November"

You can do an INNER JOIN on the orders table to get the December figures and a LEFT JOIN on the ORDERS table and include the condition order_date BETWEEN '2005-11-01' AND '2005-11-30' in the join.

Although this way of doing things makes for a very easy to read query it isn't necessarily the most efficient way of achieving the desired result.

I found that in most situations it is actually more efficient to perform the LEFT JOIN on a derived table containing customers for November orders


LinkedIn Profile
Newbie on www.simple-talk.com
Post #248337
Posted Thursday, January 05, 2006 3:45 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, September 15, 2011 3:39 AM
Points: 4, Visits: 66

The behaviour posted in completly explainable. SQL-Server is working according to the SQL-Standards.

What most people don't know is that SQL "boolean" logic is actually a tripartite logic (a logic with three states) governed (simplified) by these additional rules:

  • state: true, false, undefined
  • a comparsion (==,<>,>,< ) with NULL is undefined (this is the reason for the IS NULL operator)
  • A and undefined is undefined
  • A or undefined is A
  • not undefined is undefined

A SELECT will deliver all rows where the WHERE clause evaluates to true.

A LEFT OUTER JOIN will match all rows from table A with all rows from table B for which the ON clause evaluatues to true. If there is no such match then it will deliver row A once with all SELECTed fields from B set to NULL.

If you apply this to the last example you can explain it:

select *
from a
left outer join b
on a.id = b.id
where b.id <> 8


This will deliver the following result before applying the WHERE clause:
id          id         
----------- ----------- 
1           NULL
2           2
3           NULL
4           4
5           NULL
6           6
7           NULL
8           8

The only rows where the WHERE clause evaluates to true are 2,4,6. Row 8 evaluates to false, all other to undefined.

select *
from a
left outer join b
on a.id = b.id
  and b.id <> 8

id          id         
----------- -----------
1           NULL
2           2
3           NULL
4           4
5           NULL
6           6
7           NULL
8           NULL


This examples select all rows from A with rows from B which matching ID unequal to 8 (2,4,6) and selects one row from A with a NULL row from B.

One must be very careful when doing logic with NULL values.

Post #248367
Posted Thursday, January 05, 2006 5:50 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, July 27, 2010 6:02 AM
Points: 8, Visits: 18

I confess that I find the note by Lutz Albers much more comprehensible than the article about outer joins itself. I have found that doing everything inside of a stored procedure that begins like this ALWAYS solves these kinds of problems:

SET QUOTED_IDENTIFIER ON
GO
SET ANSI_NULLS ON
GO

The reason it works? Because setting ANSI NULLS ON gets rid of the three-way logic (true, false, and null). I have read numerous articles telling me that setting ANSI NULLS ON is totaly wrong. To this, I respectfully disagree: I program in the real world and don't have time for debates about set theory logic. Turning them off gets rid of these mysterious problems and that's good enough for me.

Post #248395
Posted Thursday, January 05, 2006 6:09 AM


SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Wednesday, April 04, 2012 9:22 PM
Points: 102, Visits: 9

Hello,
Both the article and the comment by Lutz Albers were enlightening.

When I ran into this problem, I would solve it like this:

SELECT *
FROM a
LEFT OUTER JOIN
  (SELECT * FROM b WHERE b.id <> 8) AS bx
ON a.id = bx.id

The above code clearly specifies (to me, anyway!) that excluding the value 8 should be applied to table b first, before the outer join is performed.

Thanks for the additional insight into this problem.




Regards,

Bob Monahon
Post #248401
Posted Thursday, January 05, 2006 6:16 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Monday, March 31, 2014 1:53 PM
Points: 24, Visits: 156

I agree with the previous posts. The article and the comment from Lutz Albers certainly helped explain an issue I have dealt with many times.

Here is my quick and easy way of remembering how the where clause affects an outer join -

If the where clause contains a reference to the table from the outer join, then the join is in effect an inner join.

 

That may not be a great technical explanation but it works for me.




Jeff451
SQL Guru wannabe
Post #248407
Posted Thursday, January 05, 2006 6:31 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Friday, December 06, 2013 6:52 AM
Points: 298, Visits: 332

Steve made one very good point in his article, which you don't want to lose as this thread drifts away from joins to nulls.  The article was not about how boolean logic works with nulls, (which WAS explained quite nicely by Lutz).

A Where clause in an outer join is a filter, applied after the join takes place.

After that sinks in, all the other logic makes complete sense.  It also explains why you sometimes see SQL Server not use an index where is "seems like it should" due to a field being mentioned in a where clause.

 




Student of SQL and Golf, Master of Neither
Post #248408
Posted Thursday, January 05, 2006 7:27 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, April 02, 2012 8:02 AM
Points: 6, Visits: 11

Hi Steve,

to your question on "how can you explain that those rows are removed? One other thing to notice is that this query matches up using an inner join (the first plan). " in your second example  where b.id <> 8. I found a good article http://www.databasejournal.com/features/mssql/article.php/3399931 that says that " if your database or connection ANSI_NULL setting is set ON. An ANSI_NULL setting of ON is normally the default settings for databases and connections. When ANSI_NULL setting is set to ON, the ANSI SQL-92 standard states that any equal (=) or non equal (<> statement must equate to FALSE when compared against a NULL."

that could possible explain the reason for why the null <> 8 returns false.

Regards Canute.

Post #248436
Posted Thursday, January 05, 2006 7:41 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Tuesday, December 07, 2010 8:19 AM
Points: 327, Visits: 123

Steve and all, nicely done!!!

I find the use of the additional "filter" in the ON clause facinating...

"...left outer join orders o
on c.customerid = o.customerid
and datepart(m, o.orderplacedate) = 10..."

I use A LOT of left outer joins in my work and this is the first time I've seen this type of qualifier embedded in the table join qualifier.


 

Post #248444
Posted Thursday, January 05, 2006 7:46 AM
SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: Moderators
Last Login: 2 days ago @ 3:54 PM
Points: 8,369, Visits: 733

Keep in mind the query is run thru the interpreter before it actually executes and many times the query is reformulated to the overall condition.

In you last examples the interpreter sees the fact the right table is referenced in the where clause despite it being a left join. Since the net condition is an inner join, that also would provide the fastest execution plan for the overall query, the query is optimized by the interpreter and actually executed as an inner join.

If you run

select *
 from a
 inner join b
  on a.id = b.id
 where b.id != 8

you will find you get exactly the same execution plan as

select *
 from a
 left outer join b
  on a.id = b.id
 where b.id != 8




Post #248449
« Prev Topic | Next Topic »

Add to briefcase 123»»»

Permissions Expand / Collapse