Find Customers Who Bought "A" and "B" But Not "C" (SQL Spackle)

  • Jonathan AC Roberts (3/29/2012)


    mister.magoo (3/29/2012)


    Hi Jeff,

    Great Spackle, thanks!

    In my tests, using your code to build a million row test, I found this method to be twice as fast for the same logical reads.

    SELECT DISTINCT

    CustomerID

    FROM #Purchase p1

    WHERE ProductCode = 'A'

    AND EXISTS (SELECT

    1

    FROM #Purchase p2

    WHERE p2.CustomerID = p1.CustomerID

    AND p2.ProductCode = 'B')

    AND NOT EXISTS (SELECT

    1

    FROM #Purchase p3

    WHERE p3.CustomerID = p1.CustomerID

    AND p3.ProductCode = 'C')

    Nice way to do it. I find a lot of developers almost totally overlook use of EXISTS but it can be really efficient.

    This is slightly quicker using the results that have both products 'A' and 'B' to check that there is not a 'C' :

    SELECT DISTINCT CustomerID

    FROM #Purchase p1

    WHERE ProductCode = 'A'

    AND EXISTS(SELECT 1

    FROM #Purchase p2

    WHERE p2.CustomerID = p1.CustomerID

    AND p2.ProductCode = 'B'

    AND NOT EXISTS (SELECT 1

    FROM #Purchase p3

    WHERE p3.CustomerID = p2.CustomerID

    AND p3.ProductCode = 'C'))

    Thanks for the compliment and right back at you, that certainly is more efficient with the test data I have 🙂

    Pure speed aside, I like the simplicity of the EXCEPT but would prefer this syntax as a training tool because it clearly defines how set theory is applied to this problem:

    SELECT DISTINCT CustomerID

    FROM #Purchase p1

    WHERE ProductCode = 'A'

    INTERSECT

    SELECT DISTINCT CustomerID

    FROM #Purchase p1

    WHERE ProductCode = 'B'

    EXCEPT

    SELECT DISTINCT CustomerID

    FROM #Purchase p1

    WHERE ProductCode = 'C'

    (Even though I put speed aside, that one is pretty good at 10 million : Scan count 29, logical reads 1520, CPU time = 265 ms, elapsed time = 418 ms )

    MM



    select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);

  • Forum Etiquette: How to post Reporting Services problems
  • [/url]
  • Forum Etiquette: How to post data/code on a forum to get the best help - by Jeff Moden
  • [/url]
  • How to Post Performance Problems - by Gail Shaw
  • [/url]

  • Arjun S (3/28/2012)


    Another way to get the result would be to use Intersect and then combine it with Except. The distinct part is handled implicitly.

    --===== Find Customers that bought both "A" AND "B"

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('A')

    INTERSECT

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('B')

    EXCEPT

    --===== Find Customers that bought "C".

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('C')

    ;

    Not only does it come up with the correct answer, it's quite fast, too, because it doesn't have to do an aggregate count. Nicely done!

    ------------------------------------------------------------------------------

    PRINT '========== Code From Article ==========================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('A','B')

    GROUP BY CustomerID

    HAVING COUNT(DISTINCT ProductCode) = 2

    EXCEPT

    --===== Find Customers that bought "C".

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('C')

    ) d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== Arjun S ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    --===== Find Customers that bought both "A" AND "B"

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('A')

    INTERSECT

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('B')

    EXCEPT

    --===== Find Customers that bought "C".

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('C')

    )d ;

    SET STATISTICS TIME OFF;

    Results:

    ========== Code From Article ===========================================

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 269 ms.

    ========== Arjun S =====================================================

    SQL Server Execution Times:

    CPU time = 78 ms, elapsed time = 86 ms.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • David Rueter (3/29/2012)


    Thanks, Jeff.

    I thought my standard pattern of simple joins would do about as well:

    SELECT DISTINCT p.CustomerID

    FROM

    #Purchase p

    JOIN #Purchase p2 ON

    p.CustomerID = p2.CustomerID AND

    p2.ProductCode = 'B'

    LEFT JOIN #Purchase p3 ON

    p.CustomerID = p3.CustomerID AND

    p3.ProductCode = 'C'

    WHERE

    p.ProductCode = 'A' AND

    p3.CustomerID IS NULL

    Indeed, at 1 million rows performance is similar:

  • Jeff's solution from article: CPU time = 90 ms, elapsed time = 218 ms.
  • My "JOIN" solution above: CPU time = 100 ms, elapsed time = 289 ms.
  • "HAVING SUM" solution suggested by others: CPU time = 711 ms, elapsed time = 954 ms.
  • But at 10 million rows there is a big difference:

  • Jeff's solution from article: CPU time = 701 ms, elapsed time = 911 ms
  • My "JOIN" solution above: CPU time = 4036 ms, elapsed time = 4293 ms.
  • "HAVING SUM" solution suggested by others: CPU time = 6249 ms, elapsed time = 6532 ms.
  • Thanks as always for sharing your knowledge.

    Hi, David.

    Thanks a lot for those numbers. Much appreciated!

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

  • Toby Harman (3/29/2012)


    One concern I had was looking at the query plan, the EXCEPT does a clustered index scan. That may be a function of the small data set, but I am not sure.

    On the million row test data included in the "Resources" section near the very end of the article, EXCEPT did an index seek, so your "small data set" suspicions were correct.

    Also, the INCLUDE index actually slowed the code from the article down.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Just goes to show that adding an index aint always the best approach, even if it seems logical at the time!

  • James Dingle-651585 (3/29/2012)


    Comparing syntax without an idea of the performance signature is a bit useless IMHO.

    I'm 100% in agreement with you. That's why I stated the following in the article. 😉

    For those that want to explore and compare solutions of their own, I've attached code to build a million row test table at the bottom of this article in the "Resources" link.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Wavesailor (3/29/2012)


    Thanks for article and comments - I learnt a lot.

    In MySQL 😀 this is the fastest by far:

    SELECT

    CustomerId

    FROM Purchase

    WHERE ProductCode IN ('A','B', 'C')

    GROUP BY CustomerID

    having sum(case when ProductCode = 'A' then 1 else 0 end) > 0

    and sum(case when ProductCode = 'B' then 1 else 0 end) > 0

    and sum(case when ProductCode = 'C' then 1 else 0 end) = 0

    Unfortunately MySQL does not have the EXCEPT operator.

    -jj

    I don't know if MySQL can take it but, if you haven't already done so, you might want to try one of the EXISTS/NOT EXISTS methods. I'll post "race" results for all the different methods soon.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • venoym (3/29/2012)


    I haven't tested the performance as I don't have a setup to do that,

    Heh... sure you do. I mentioned that I attached code to the million row setup right in the article and the code is in the resources section at the end of the article as a link.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • codebyo (3/29/2012)


    SELECT

    CustomerId

    FROM Purchase

    WHERE ProductCode IN ('A','B', 'C')

    GROUP BY CustomerID

    having sum(case when ProductCode = 'A' then 1 else 0 end) > 0

    and sum(case when ProductCode = 'B' then 1 else 0 end) > 0

    and sum(case when ProductCode = 'C' then 1 else 0 end) = 0

    That's a very interesting method! I'll take note of it just in case I need something like this.

    One question though:

    Wouldn't the query below present the same results?

    SELECT

    CustomerId

    FROM Purchase

    WHERE ProductCode IN ('A','B', 'C')

    GROUP BY CustomerID

    having sum(case when ProductCode = 'C' then 1 else 0 end) = 0

    Sorry, Andre... it returned 1,2,3,6,7. Should only return 1,2,3.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Toby Harman (3/29/2012)


    Hate to point this out, but ColdCoffee's approach is returning inaccurate results.

    Specifically, it returns customers 6 and 7 from the original test data.

    Not sure what happened there Toby, but ColdCoffee's code is returning the correct result for me when I play it on my machine against the original data.

    {EDIT} Ah... I see. He fixed it later.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Heh... Ok... first, I have to say that I'm amazed. This particular article got twice as many reads in 1/4 the time as the "Generating test data" article did just this previous Monday.

    I'll also say thank you all very much for participating in the discussion and for taking the time to post code and/or provide feedback. What's really cool is that I come up with a simple idea and good folks like yourselves try to make it better and usually succeed. Several of you posted code that runs significantly faster than mine with Arjun S coming up with the fastest (at least on my ol' war horse of a desktop box).

    Here's how I tested all of the solutions what worked according to a separate test against the original data provided in the article. I 'threw away" the results in a variable to take the display out of the picture without throwing away the print statements or rowcounts.

    PRINT '========== Code From Article ==========================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('A','B')

    GROUP BY CustomerID

    HAVING COUNT(DISTINCT ProductCode) = 2

    EXCEPT

    --===== Find Customers that bought "C".

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('C')

    ) d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== Arjun S ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('A')

    INTERSECT

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('B')

    EXCEPT

    --===== Find Customers that bought "C".

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('C')

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== tommyh ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT

    CustomerId

    FROM #Purchase

    WHERE ProductCode IN ('A','B', 'C')

    GROUP BY CustomerID

    having sum(case when ProductCode = 'A' then 1 else 0 end) > 0

    and sum(case when ProductCode = 'B' then 1 else 0 end) > 0

    and sum(case when ProductCode = 'C' then 1 else 0 end) = 0

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== Toby Harman ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT a.CustomerID

    FROM #Purchase a

    LEFT OUTER JOIN #Purchase b

    ON b.CustomerID = a.CustomerID

    AND b.ProductCode IN ('C')

    WHERE a.ProductCode IN ('A','B')

    AND b.CustomerID IS NULL

    GROUP BY a.CustomerID

    HAVING COUNT(DISTINCT a.ProductCode) = 2

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== James Dingle ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SelectDistinct

    CustomerID

    From#Purchase

    WhereProductCode = 'A'

    AndCustomerID In

    (

    SelectCustomerID

    From#Purchase

    WhereProductCode = 'B'

    )

    AndCustomerID Not In

    (

    SelectCustomerID

    From#Purchase

    WhereProductCode = 'C'

    )

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== chintan.j.gandhi ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT DISTINCT CustomerID

    FROM #Purchase P

    WHERE EXISTS (SELECT 1 FROM #Purchase P1

    WHERE P.customerid = P1.customerid

    AND P1.productcode = 'A'

    )

    AND EXISTS (SELECT 1 FROM #Purchase P2

    WHERE P.customerid = P2.customerid

    AND P2.productcode = 'B'

    )

    AND NOT EXISTS (SELECT 1 FROM #Purchase P3

    WHERE P.customerid = P3.customerid

    AND P3.productcode = 'C'

    )

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== Dalibor Margotic CTE (First method already posted by Arjun S) ====='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT;

    WITH CTE AS

    (

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode = 'A'

    GROUP BY CustomerID

    )

    ,CTE2 AS

    (

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode = 'B'

    GROUP BY CustomerID

    )

    ,CTE3 AS

    (

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode = 'C'

    GROUP BY CustomerID

    )

    select @Bitbucket = CTE.CustomerID from CTE

    left join CTE3 ON

    CTE.CustomerID = CTE3.CustomerID

    where CTE.CustomerID in (SELECT CustomerID FROM CTE2)

    AND CTE3.CustomerID IS NULL

    GROUP BY CTE.CustomerID

    ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== MAGOO ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT DISTINCT

    CustomerID

    FROM #Purchase p1

    WHERE ProductCode = 'A'

    AND EXISTS (SELECT

    1

    FROM #Purchase p2

    WHERE p2.CustomerID = p1.CustomerID

    AND p2.ProductCode = 'B')

    AND NOT EXISTS (SELECT

    1

    FROM #Purchase p3

    WHERE p3.CustomerID = p1.CustomerID

    AND p3.ProductCode = 'C')

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== Venoym ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT

    b.CustomerID

    FROM

    (

    SELECT

    a.CustomerID

    FROM

    (

    SELECT

    DISTINCT

    CustomerID,

    ProductCode

    FROM

    #Purchase

    WHERE

    ProductCode IN ('A','B')) AS a

    GROUP BY

    a.CustomerID

    HAVING

    COUNT(ProductCode) = 2

    ) AS b LEFT OUTER JOIN

    #Purchase p ON b.CustomerID = p.CustomerID AND p.ProductCode = 'C'

    WHERE

    p.ProductCode IS NULL

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== Jonathan AC Roberts ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT DISTINCT CustomerID

    FROM #Purchase p1

    WHERE ProductCode = 'A'

    AND EXISTS(SELECT 1

    FROM #Purchase p2

    WHERE p2.CustomerID = p1.CustomerID

    AND p2.ProductCode = 'B'

    AND NOT EXISTS (SELECT 1

    FROM #Purchase p3

    WHERE p3.CustomerID = p2.CustomerID

    AND p3.ProductCode = 'C'))

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== bob.probst ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    select distinct a.customerid

    from #purchase a

    inner join #purchase b on (a.customerid = b.customerid)

    left join #purchase c on (a.customerid = c.customerid and c.productcode = 'C')

    where a.productcode = 'A'

    and b.productcode = 'B'

    and c.purchaseid is null

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== Vyengr ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    Select Distinct CustomerID

    FROM #Purchase PU

    WHERE ProductCode ='A'

    AND EXISTS (Select CustomerID

    FROM #Purchase

    WHERE ProductCode ='B' and CustomerID = PU.CustomerID)

    AND NOT EXISTS (Select PU.CustomerID

    FROM #Purchase

    WHERE ProductCode ='C' and CustomerID = PU.CustomerID)

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== ejoell 66477 ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    Select Distinct ABuyers.CustomerID from

    (Select CustomerID FROM #Purchase WHERE ProductCode = 'A') ABuyers

    JOIN

    (Select CustomerID FROM #Purchase WHERE ProductCode = 'B') BBuyers

    ON ABuyers.CustomerID = BBuyers.CustomerID

    LEFT JOIN

    (Select CustomerID FROM #Purchase WHERE ProductCode = 'C') CBuyers

    ON ABuyers.CustomerID = CBuyers.CustomerID

    WHERE CBuyers.CustomerID is null

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== David Rueter ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT DISTINCT p.CustomerID

    FROM

    #Purchase p

    JOIN #Purchase p2 ON

    p.CustomerID = p2.CustomerID AND

    p2.ProductCode = 'B'

    LEFT JOIN #Purchase p3 ON

    p.CustomerID = p3.CustomerID AND

    p3.ProductCode = 'C'

    WHERE

    p.ProductCode = 'A' AND

    p3.CustomerID IS NULL

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== ColdCoffee ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT;

    WITH CTE AS

    (

    SELECT P.CustomerID

    , Indicator1 = CASE WHEN P.ProductCode IN('A') THEN 0

    WHEN P.ProductCode IN('C') THEN -1

    END

    , Indicator2 = CASE WHEN P.ProductCode IN('B') THEN 1

    ELSE 0

    END

    FROM #Purchase P

    WHERE P.ProductCode IN ('A','B','C')

    )

    SELECT @Bitbucket = C.CustomerID

    FROM CTE C

    GROUP BY C.CustomerID

    HAVING SUM (C.Indicator1) = 0 AND SUM(C.Indicator2) > 0

    ;

    SET STATISTICS TIME OFF;

    GO

    Against the million row test data generator that I provided in the "Resources" section of the article near the end of the article, here are the result from the code above. Arjun is the clear winner here.

    [font="Courier New"]========== Code From Article ===========================================

    SQL Server Execution Times:

    CPU time = 265 ms, elapsed time = 329 ms.

    ========== Arjun S =====================================================

    SQL Server Execution Times:

    CPU time = 78 ms, elapsed time = 100 ms.

    ========== tommyh =====================================================

    SQL Server Execution Times:

    CPU time = 329 ms, elapsed time = 331 ms.

    ========== Toby Harman =====================================================

    SQL Server Execution Times:

    CPU time = 328 ms, elapsed time = 327 ms.

    ========== James Dingle =====================================================

    SQL Server Execution Times:

    CPU time = 281 ms, elapsed time = 298 ms.

    ========== chintan.j.gandhi =====================================================

    SQL Server Execution Times:

    CPU time = 719 ms, elapsed time = 726 ms.

    ========== Dalibor Margotic CTE (First method already posted by Arjun S) =====

    SQL Server Execution Times:

    CPU time = 109 ms, elapsed time = 95 ms.

    ========== MAGOO =====================================================

    SQL Server Execution Times:

    CPU time = 94 ms, elapsed time = 92 ms.

    ========== Venoym =====================================================

    SQL Server Execution Times:

    CPU time = 359 ms, elapsed time = 362 ms.

    ========== Jonathan AC Roberts =====================================================

    SQL Server Execution Times:

    CPU time = 94 ms, elapsed time = 89 ms.

    ========== bob.probst =====================================================

    SQL Server Execution Times:

    CPU time = 266 ms, elapsed time = 270 ms.

    ========== Vyengr =====================================================

    SQL Server Execution Times:

    CPU time = 94 ms, elapsed time = 92 ms.

    ========== ejoell 66477 =====================================================

    SQL Server Execution Times:

    CPU time = 266 ms, elapsed time = 266 ms.

    ========== David Rueter =====================================================

    SQL Server Execution Times:

    CPU time = 281 ms, elapsed time = 275 ms.

    ========== ColdCoffee =====================================================

    SQL Server Execution Times:

    CPU time = 360 ms, elapsed time = 371 ms.

    Warning: Null value is eliminated by an aggregate or other SET operation.[/font]

    I also want to say that the SQL IQ shown by folks in this discussion is quite high (well, except for mine... I lost pretty badly considering that I wrote the article :blush:). No one resorted to RBAR of any form and the difference in most of the times are less than 250 ms in difference against a million row problem. You should all take a bow. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi, Jeff.

    Indeed. I was at work and couldn't test it properly.

    I should've noticed that the IN clause may bring results from either A OR B. 🙂

    That's an excellent article by the way.

    I use UNION ALL when needed but I've never given EXCEPT or INTERSECT the attention they deserve. 😀

    ========== Arjun S =====================================================

    SQL Server Execution Times:

    CPU time = 78 ms, elapsed time = 100 ms.

    :Wow: A nice combination of INTERSECT and EXCEPT gets the price. Congrats!

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

  • codebyo (3/29/2012)


    Hi, Jeff.

    Indeed. I was at work and couldn't test it properly.

    I should've noticed that the IN clause may bring results from either A OR B. 🙂

    That's an excellent article by the way.

    I use UNION ALL when needed but I've never given EXCEPT or INTERSECT the attention they deserve. 😀

    ========== Arjun S =====================================================

    SQL Server Execution Times:

    CPU time = 78 ms, elapsed time = 100 ms.

    :Wow: A nice combination of INTERSECT and EXCEPT gets the price. Congrats!

    Thanks for the feedback, Andre. I just have to get better at beating the general public at my own game. 😛

    I love this forum and this community. Like I said, I throw an idea out there and a bunch of people rise to the occasion and share their own thoughts and code... and they usually come up with improvements! To top it off, no one got snotty or nasty about it! Ya just gotta love this place.

    Thanks folks. I really appreciate it all.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff, is it possible to run the test on 10 Million row table, if you have time?

  • Jeff Moden (3/29/2012)


    Thanks for the feedback, Andre. I just have to get better at beating the general public at my own game. 😛

    I love this forum and this community. Like I said, I throw an idea out there and a bunch of people rise to the occasion and share their own thoughts and code... and they usually come up with improvements! To top it off, no one got snotty or nasty about it! Ya just gotta love this place.

    Thanks folks. I really appreciate it all.

    That's probably one of the greatest discussions followed by a great article I've seen in weeks!

    I love this place because of things like this. Everyone will contribute their own ideas and expand existing ones. I thank you for starting this, as always.

    Now what I find really interesting is that queries that use EXISTS and NOT EXISTS run really fast when the data is filtered (ProductCode = 'A') prior to being passed to those operators. See MAGOO and Jonathan AC Roberts code. Something to keep in mind for sure.

    Best regards,

    Andre Guerreiro Neto

    Database Analyst
    http://www.softplan.com.br
    MCITPx1/MCTSx2/MCSE/MCSA

  • Viewing 15 posts - 46 through 60 (of 166 total)

    You must be logged in to reply to this topic. Login to reply