Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 123»»»

LEFT JOIN vs EXCEPT Expand / Collapse
Author
Message
Posted Monday, July 28, 2008 4:18 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Wednesday, February 24, 2010 4:10 AM
Points: 1,553, Visits: 2,232
HI All,

I have just come across the except tool in SQL 2005 and have been running some performance checks to see which is better in the following situation:
SELECT id FROM tableA
EXCEPT
SELECT ClientId FROM tableB

SELECT id
FROM tableA a
LEFT JOIN tableB b ON b.ClientID = a.id
WHERE b.id is null

When looking at the execution plan the EXCEPT cost only 6% where the join costs 94%

However the time in ms of the EXCEPT is much slower and the LOGICAL reads is much higher.

This leaves me confused :-(

Could someone let me know or point me in the correct direction of which is better to use and why?

Thanks
Chris


----------------------------------------------
Try to learn something about everything and everything about something. - Thomas Henry Huxley


Posting Best Practices
Numbers / Tally Tables

SQL-4-Life
Post #541726
Posted Monday, July 28, 2008 6:26 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Wednesday, September 10, 2014 8:41 AM
Points: 438, Visits: 908
EXCEPT is not optimized to perform like a join. it's the same as doing thing as you dumping your first query into a temp table and then deleting any rows that match the second query. A properly written join or sub-select could do this more efficiently.

Also remember that with EXCEPT both queries have to have the same columns and all columns will be compared.
Post #541817
Posted Tuesday, July 29, 2008 6:48 AM


SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Monday, July 7, 2014 7:19 AM
Points: 250, Visits: 537
In your case LEFT OUTER JOIN is the way to go (and the fastest)

but with a lot of columns (and a smallish database) EXCEPT can be powerful consider this application

table 1 is a download of a cash register (no more than 4000-1000 items)
table 2 is a SQL table supposedly in complete synchronization with said cash register table

EXCEPT will efficiently "pop" the discrepancies
Post #542580
Posted Wednesday, July 30, 2008 10:42 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Today @ 12:57 PM
Points: 2,677, Visits: 19,270
So, for someone who is using 2000 still, but looking forward to 2005, EXCEPT will let me intersect the two sets, and give me the results that are not overlapping, correct?

---------------------------------------------------------
How best to post your question
How to post performance problems
Tally Table:What it is and how it replaces a loop

"stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
Post #543703
Posted Wednesday, July 30, 2008 2:48 PM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Friday, June 27, 2014 12:43 PM
Points: 15,444, Visits: 9,596
Yes.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #543914
Posted Wednesday, July 30, 2008 3:02 PM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Friday, June 27, 2014 12:43 PM
Points: 15,444, Visits: 9,596
I just found something quite interesting on this question.

I hadn't realized that BOL says, "EXCEPT returns any distinct values from the left query that are not also found on the right query." Note, "distinct". I hadn't noticed that before.

Except will give you the non-overlaps between two queries (two tables, whatever), but it will only return one row for each distinct value.

I found this by accident while performing some speed tests on left join vs except. Left join was returning over 8000 rows, while except was giving me 21, because of the way I generated the test data.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #543918
Posted Wednesday, July 30, 2008 3:33 PM
SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Today @ 7:26 AM
Points: 4,390, Visits: 9,536
GSquared (7/30/2008)
I just found something quite interesting on this question.

I hadn't realized that BOL says, "EXCEPT returns any distinct values from the left query that are not also found on the right query." Note, "distinct". I hadn't noticed that before.

Except will give you the non-overlaps between two queries (two tables, whatever), but it will only return one row for each distinct value.

I found this by accident while performing some speed tests on left join vs except. Left join was returning over 8000 rows, while except was giving me 21, because of the way I generated the test data.


This is very good to know...didn't realize that and can only imagine how frustrating tracking that down could be.

Now, I don't have to - thanks. ;)


Jeffrey Williams
Problems are opportunites brilliantly disguised as insurmountable obstacles.

How to post questions to get better answers faster
Managing Transaction Logs
Post #543936
Posted Wednesday, July 30, 2008 4:49 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Thursday, June 5, 2014 10:54 AM
Points: 9,902, Visits: 9,480
jcrawf02 (7/30/2008)
So, for someone who is using 2000 still, but looking forward to 2005, EXCEPT will let me intersect the two sets, and give me the results that are not overlapping, correct?

Well technically, no. What you describe is the Exclusive Union, or Disjoint Union (which is the Set equivalent of XOR in logic) which in SQL Server could be expressed as:
(A UNION B) EXCEPT (A INTERSECT B)

EXCEPT (which used to be called "MINUS" in some old implementations of SQL) is more like a subtraction operator. It returns only the (distinct) elements of the first set that are not in the second set. In logical operators, EXCEPT would be:
A AND (NOT B)

Of course that may be what you actually meant, but the details really matter when it comes to logic and set theory descriptions.


-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
Post #543965
Posted Wednesday, July 30, 2008 8:50 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 8:34 PM
Points: 37,098, Visits: 31,648
I may have to write an article about it someday... I've found that WHERE NOT IN beats the OUTER JOIN type of exclusion, WHERE NOT EXISTS correlated subqueries, and a couple of other methods... wouldn't be surprised if it beat the pants off of EXCEPT, as well.

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #544034
Posted Wednesday, July 30, 2008 9:32 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 8:34 PM
Points: 37,098, Visits: 31,648
Oh, yeah... almost forgot... "In the land of the blind, the one eyed man is King!"... That's why most people look at the Execution Plan and believe in it. I don't because I have "two eyes". The Percent of Batch is frequently VERY wrong and should never be considered when trying to optimize code... especially when comparing two methods to select the better performing one. I actually have code that shows that one of two queries that return identical results shows a Percent of Batch as 100% and the other shows 0%... yet, the 100% blows the 0% code away!

Ah, but why would you simply take my word for it? (You shouldn't) :P Here's the offending code...

SET NOCOUNT ON
--=======================================================================================
-- Recursive method shown by (Name with-held)
--=======================================================================================
PRINT '========== Recursive method =========='
--===== Turn on some performance counters ===============================================
SET STATISTICS IO ON
SET STATISTICS TIME ON
DECLARE @BitBucket DATETIME --Holds display output so display times aren't measured.

--===== Execute the code being tested ===================================================
DECLARE @DateVal DATETIME
SET @DateVal = '2008-01-01'

;with mycte as
(
select @DateVal AS DateVal
union all
select DateVal + 1
from mycte
where DateVal + 1 < DATEADD(yy, 5, @DateVal)
)
select @BitBucket = d.dateval
from mycte d
OPTION (MAXRECURSION 0)

--===== Turn off the performance counters and print a separator =========================
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',90)
GO

--=======================================================================================
-- Tally table method by Jeff Moden
--=======================================================================================
PRINT '========== Tally table method =========='
--===== Turn on some performance counters ===============================================
SET STATISTICS IO ON
SET STATISTICS TIME ON
DECLARE @BitBucket DATETIME --Holds display output so display times aren't measured.

--===== Execute the code being tested ===================================================
DECLARE @StartDate AS DATETIME
SET @StartDate = '2008-01-01'

SELECT TOP (DATEDIFF(dd,@StartDate,DATEADD(yy,5,@StartDate)))
@BitBucket = @StartDate-1+t.N
FROM Tally t
ORDER BY N

--===== Turn off the performance counters and print a separator =========================
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',90)

Tally table I used is here...
http://www.sqlservercentral.com/articles/TSQL/62867/


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #544042
« Prev Topic | Next Topic »

Add to briefcase 123»»»

Permissions Expand / Collapse