Intersect, Except, Union, All and Any

  • Comments posted to this topic are about the item Intersect, Except, Union, All and Any

  • I had similar feelings when PIVOT came out. Proves that "new" or "different" is frequently not better, faster, or easier to read. 😛

    Good article David. Thanks for taking the time to write it and to do the comparisons.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Very good article David.

    One area where I find INTERSECT / EXCEPT easier to work with is when dealing with multiple columns, such as what you might find for a PK. It would really be interesting to see a comparison between the different methods using multiple columns instead of just a single column.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Why would anyone think that INTERSECT is easier than INNER JOIN?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hey David,

    Nice article, I think you'll find the advantage with EXCEPT when you have to compare multiple columns with possible null values on both sides. When you compare null values with a JOIN or a WHERE statement you will get a "wrong" result, when using the EXCEPT statement you will get the "right" result.

    see also:http://en.wikipedia.org/wiki/Null_(SQL)#Grouping_and_sorting

    Kim

  • When I look at the article none of the images show up? I see no other comments in the forum - is it just me? For ex...

    image001 gives me nothing

  • I get no images!!!

  • No Images either. How Useful!!!

  • Nice article!

    I've used EXCEPT a few times recently - simply cos I could, I guess, and also sometimes (even if it's relatively unknown syntax) it is clearer when reading the code what is going on.

    One limiting aspect of EXCEPT (and also INTERSECT - although I haven't used it) is that the columns have to match (like they do for a UNION) whereas with NOT IN / NOT EXISTS etc. there is no such restriction.

  • Same here, no images, just the red x in a box.

  • Hello, the article appears very interesting and useful, but I am concerned that I am missing important parts of it because the image links are broken.

    Will you fix these links so we can see the entire article as you intended?

    Thanks!

  • Images would be extremely helpful.

  • Jeff Moden (7/19/2009)


    Why would anyone think that INTERSECT is easier than INNER JOIN?

    Jeff,

    In some cases intersect is easier to use than inner join, if you want to compare resultsets. For example, an inner join requires you to add EVERY column to the inner join list or to the predicate to achieve what you could accomplish by use the intersect operator. I am not saying this is the best method, but it can sometimes reduce coding and save you time. Look at the example below and imagine if you wanted to compare tables with 15+ columns. Essentially with intersect, you can copy and paste the same query on both sides of the operator.

    Note: To those reading this, you should not use select * in your select list. I did so below to demonstrate a point.

    E.g.

    DECLARE @t1 TABLE(

    id INT,

    col1 CHAR(1),

    col2 SMALLINT,

    col3 CHAR(1),

    col4 CHAR(1),

    col5 CHAR(1)

    )

    INSERT INTO @t1 VALUES (1,'a',10,'z','z','z');

    INSERT INTO @t1 VALUES (2,'b',20,'z','z','z');

    INSERT INTO @t1 VALUES (3,'c',30,'z','z','z');

    INSERT INTO @t1 VALUES (4,'d',40,'z','z','z');

    DECLARE @t2 TABLE(

    id INT,

    col CHAR(1),

    col2 SMALLINT,

    col3 CHAR(1),

    col4 CHAR(1),

    col5 CHAR(1)

    )

    INSERT INTO @t2 VALUES (1,'a',10,'z','z','z');

    INSERT INTO @t2 VALUES (2,'b',20,'z','z','z');

    INSERT INTO @t2 VALUES (4,'d',30,'z','z','z');

    INSERT INTO @t2 VALUES (5,'e',40,'z','z','z');

    --intersect

    SELECT *

    FROM @t1

    INTERSECT

    SELECT *

    FROM @t2

    --equivilant

    SELECT

    t1.*

    FROM @t1 t1

    INNER JOIN @t2 t2

    ON t1.id = t2.id

    AND t1.col1 = t2.col

    AND t1.col2 = t2.col2

    AND t1.col3 = t2.col3

    AND t1.col4 = t2.col4

    AND t1.col5 = t2.col5

    As for except, I find this operator quite useful when trying to diagnose RI violaters and resultsets that dont match between source and destination. Except has the same benefits as Intersect, which is namely reduced typing.

    I personally dont allow or use these operators in production code, but I sometimes use these operators to quickly retrieve information.

    I recently blogged about intersect and except. You can have a look if you like:

    http://jahaines.blogspot.com/2009/07/should-i-intersect-or-except.html

  • I am not near my workstation to test, but I am curious if the tests were affected by caching? Any time I have two statements that are different but produce the same timing I am suspicious. I know that these statements may actually run the same in the engine, using the same query plan, but then again the difference might be in the details there.

    Perhaps there is a difference in how the query performs when there is no data already in cache. I know that this has me interested enough that I want to test this against a massive dataset, then retest with the old query after cycling SQL.

    Great article though, this may prove very interesting.

  • Lee Hilton (7/20/2009)


    I am not near my workstation to test, but I am curious if the tests were affected by caching? Any time I have two statements that are different but produce the same timing I am suspicious. I know that these statements may actually run the same in the engine, using the same query plan, but then again the difference might be in the details there.

    Perhaps there is a difference in how the query performs when there is no data already in cache. I know that this has me interested enough that I want to test this against a massive dataset, then retest with the old query after cycling SQL.

    Great article though, this may prove very interesting.

    I know that the author is using the AdventureWorks database, but I'm curious just how much data he was testing on.

    Random Technical Stuff[/url]

Viewing 15 posts - 1 through 15 (of 86 total)

You must be logged in to reply to this topic. Login to reply