LEFT JOIN vs EXCEPT

  • HI All,

    I have just come across the except tool in SQL 2005 and have been running some performance checks to see which is better in the following situation:

    SELECT id FROM tableA

    EXCEPT

    SELECT ClientId FROM tableB

    SELECT id

    FROM tableA a

    LEFT JOIN tableB b ON b.ClientID = a.id

    WHERE b.id is null

    When looking at the execution plan the EXCEPT cost only 6% where the join costs 94%

    However the time in ms of the EXCEPT is much slower and the LOGICAL reads is much higher.

    This leaves me confused 🙁

    Could someone let me know or point me in the correct direction of which is better to use and why?

    Thanks

    Chris

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • EXCEPT is not optimized to perform like a join. it's the same as doing thing as you dumping your first query into a temp table and then deleting any rows that match the second query. A properly written join or sub-select could do this more efficiently.

    Also remember that with EXCEPT both queries have to have the same columns and all columns will be compared.

  • In your case LEFT OUTER JOIN is the way to go (and the fastest)

    but with a lot of columns (and a smallish database) EXCEPT can be powerful consider this application

    table 1 is a download of a cash register (no more than 4000-1000 items)

    table 2 is a SQL table supposedly in complete synchronization with said cash register table

    EXCEPT will efficiently "pop" the discrepancies

  • So, for someone who is using 2000 still, but looking forward to 2005, EXCEPT will let me intersect the two sets, and give me the results that are not overlapping, correct?

    ---------------------------------------------------------
    How best to post your question[/url]
    How to post performance problems[/url]
    Tally Table:What it is and how it replaces a loop[/url]

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • Yes.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I just found something quite interesting on this question.

    I hadn't realized that BOL says, "EXCEPT returns any distinct values from the left query that are not also found on the right query." Note, "distinct". I hadn't noticed that before.

    Except will give you the non-overlaps between two queries (two tables, whatever), but it will only return one row for each distinct value.

    I found this by accident while performing some speed tests on left join vs except. Left join was returning over 8000 rows, while except was giving me 21, because of the way I generated the test data.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (7/30/2008)


    I just found something quite interesting on this question.

    I hadn't realized that BOL says, "EXCEPT returns any distinct values from the left query that are not also found on the right query." Note, "distinct". I hadn't noticed that before.

    Except will give you the non-overlaps between two queries (two tables, whatever), but it will only return one row for each distinct value.

    I found this by accident while performing some speed tests on left join vs except. Left join was returning over 8000 rows, while except was giving me 21, because of the way I generated the test data.

    This is very good to know...didn't realize that and can only imagine how frustrating tracking that down could be.

    Now, I don't have to - thanks. 😉

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • jcrawf02 (7/30/2008)


    So, for someone who is using 2000 still, but looking forward to 2005, EXCEPT will let me intersect the two sets, and give me the results that are not overlapping, correct?

    Well technically, no. What you describe is the Exclusive Union, or Disjoint Union (which is the Set equivalent of XOR in logic) which in SQL Server could be expressed as:

    (A UNION B) EXCEPT (A INTERSECT B)

    EXCEPT (which used to be called "MINUS" in some old implementations of SQL) is more like a subtraction operator. It returns only the (distinct) elements of the first set that are not in the second set. In logical operators, EXCEPT would be:

    A AND (NOT B)

    Of course that may be what you actually meant, but the details really matter when it comes to logic and set theory descriptions.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • I may have to write an article about it someday... I've found that WHERE NOT IN beats the OUTER JOIN type of exclusion, WHERE NOT EXISTS correlated subqueries, and a couple of other methods... wouldn't be surprised if it beat the pants off of EXCEPT, as well. :hehe:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Oh, yeah... almost forgot... "In the land of the blind, the one eyed man is King!"... That's why most people look at the Execution Plan and believe in it. I don't because I have "two eyes". The Percent of Batch is frequently VERY wrong and should never be considered when trying to optimize code... especially when comparing two methods to select the better performing one. I actually have code that shows that one of two queries that return identical results shows a Percent of Batch as 100% and the other shows 0%... yet, the 100% blows the 0% code away!

    Ah, but why would you simply take my word for it? (You shouldn't) 😛 Here's the offending code...

    SET NOCOUNT ON

    --=======================================================================================

    -- Recursive method shown by (Name with-held)

    --=======================================================================================

    PRINT '========== Recursive method =========='

    --===== Turn on some performance counters ===============================================

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    DECLARE @Bitbucket DATETIME --Holds display output so display times aren't measured.

    --===== Execute the code being tested ===================================================

    DECLARE @DateVal DATETIME

    SET @DateVal = '2008-01-01'

    ;with mycte as

    (

    select @DateVal AS DateVal

    union all

    select DateVal + 1

    from mycte

    where DateVal + 1 < DATEADD(yy, 5, @DateVal)

    )

    select @Bitbucket = d.dateval

    from mycte d

    OPTION (MAXRECURSION 0)

    --===== Turn off the performance counters and print a separator =========================

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT REPLICATE('=',90)

    GO

    --=======================================================================================

    -- Tally table method by Jeff Moden

    --=======================================================================================

    PRINT '========== Tally table method =========='

    --===== Turn on some performance counters ===============================================

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    DECLARE @Bitbucket DATETIME --Holds display output so display times aren't measured.

    --===== Execute the code being tested ===================================================

    DECLARE @StartDate AS DATETIME

    SET @StartDate = '2008-01-01'

    SELECT TOP (DATEDIFF(dd,@StartDate,DATEADD(yy,5,@StartDate)))

    @Bitbucket = @StartDate-1+t.N

    FROM Tally t

    ORDER BY N

    --===== Turn off the performance counters and print a separator =========================

    SET STATISTICS TIME OFF

    SET STATISTICS IO OFF

    PRINT REPLICATE('=',90)

    Tally table I used is here...

    http://www.sqlservercentral.com/articles/TSQL/62867/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • don't forget that except implicitly compares every column, not just keys. so it can be used in situations where a not exists or not in or outer join wouldn't be appropriate or practical. (would you really want to code a join between tables with 120 columns?)

  • rbarryyoung (7/30/2008)


    jcrawf02 (7/30/2008)


    So, for someone who is using 2000 still, but looking forward to 2005, EXCEPT will let me intersect the two sets, and give me the results that are not overlapping, correct?

    Well technically, no. What you describe is the Exclusive Union, or Disjoint Union (which is the Set equivalent of XOR in logic) which in SQL Server could be expressed as:

    (A UNION B) EXCEPT (A INTERSECT B)

    EXCEPT (which used to be called "MINUS" in some old implementations of SQL) is more like a subtraction operator. It returns only the (distinct) elements of the first set that are not in the second set. In logical operators, EXCEPT would be:

    A AND (NOT B)

    Of course that may be what you actually meant, but the details really matter when it comes to logic and set theory descriptions.

    That is what I actually meant, thanks, and I don't know that I really understand what I meant now, so until I do some more reading, I don't think I'll be using it. 😉 Thanks for the info!

    By the way, it's very big of you to contribute to the same thread as Jeff, when he's obviously anti-RBARry, but just can't spell your name . . .:hehe:

    ---------------------------------------------------------
    How best to post your question[/url]
    How to post performance problems[/url]
    Tally Table:What it is and how it replaces a loop[/url]

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • Jeff,

    How would you best two queries to see which is performing better?

    thanks

    Chris

    ----------------------------------------------
    Try to learn something about everything and everything about something. - Thomas Henry Huxley

    :w00t:
    Posting Best Practices[/url]
    Numbers / Tally Tables[/url]

    SQL-4-Life
  • jcrawf02 (8/1/2008)


    By the way, it's very big of you to contribute to the same thread as Jeff, when he's obviously anti-RBARry, but just can't spell your name . . .:hehe:

    Heh, you don't know the half of it. Check out this post and the thread that follows: http://www.sqlservercentral.com/Forums/FindPost502435.aspx 😀

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • rbarryyoung (8/1/2008)


    jcrawf02 (8/1/2008)


    By the way, it's very big of you to contribute to the same thread as Jeff, when he's obviously anti-RBARry, but just can't spell your name . . .:hehe:

    Heh, you don't know the half of it. Check out this post and the thread that follows: http://www.sqlservercentral.com/Forums/FindPost502435.aspx 😀

    THAT was a lot of fun! Barry is a heck of a good sport!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply