• Heh... Ok... first, I have to say that I'm amazed. This particular article got twice as many reads in 1/4 the time as the "Generating test data" article did just this previous Monday.

    I'll also say thank you all very much for participating in the discussion and for taking the time to post code and/or provide feedback. What's really cool is that I come up with a simple idea and good folks like yourselves try to make it better and usually succeed. Several of you posted code that runs significantly faster than mine with Arjun S coming up with the fastest (at least on my ol' war horse of a desktop box).

    Here's how I tested all of the solutions what worked according to a separate test against the original data provided in the article. I 'threw away" the results in a variable to take the display out of the picture without throwing away the print statements or rowcounts.

    PRINT '========== Code From Article ==========================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('A','B')

    GROUP BY CustomerID

    HAVING COUNT(DISTINCT ProductCode) = 2

    EXCEPT

    --===== Find Customers that bought "C".

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('C')

    ) d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== Arjun S ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('A')

    INTERSECT

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('B')

    EXCEPT

    --===== Find Customers that bought "C".

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode IN ('C')

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== tommyh ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT

    CustomerId

    FROM #Purchase

    WHERE ProductCode IN ('A','B', 'C')

    GROUP BY CustomerID

    having sum(case when ProductCode = 'A' then 1 else 0 end) > 0

    and sum(case when ProductCode = 'B' then 1 else 0 end) > 0

    and sum(case when ProductCode = 'C' then 1 else 0 end) = 0

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== Toby Harman ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT a.CustomerID

    FROM #Purchase a

    LEFT OUTER JOIN #Purchase b

    ON b.CustomerID = a.CustomerID

    AND b.ProductCode IN ('C')

    WHERE a.ProductCode IN ('A','B')

    AND b.CustomerID IS NULL

    GROUP BY a.CustomerID

    HAVING COUNT(DISTINCT a.ProductCode) = 2

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== James Dingle ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SelectDistinct

    CustomerID

    From#Purchase

    WhereProductCode = 'A'

    AndCustomerID In

    (

    SelectCustomerID

    From#Purchase

    WhereProductCode = 'B'

    )

    AndCustomerID Not In

    (

    SelectCustomerID

    From#Purchase

    WhereProductCode = 'C'

    )

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== chintan.j.gandhi ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT DISTINCT CustomerID

    FROM #Purchase P

    WHERE EXISTS (SELECT 1 FROM #Purchase P1

    WHERE P.customerid = P1.customerid

    AND P1.productcode = 'A'

    )

    AND EXISTS (SELECT 1 FROM #Purchase P2

    WHERE P.customerid = P2.customerid

    AND P2.productcode = 'B'

    )

    AND NOT EXISTS (SELECT 1 FROM #Purchase P3

    WHERE P.customerid = P3.customerid

    AND P3.productcode = 'C'

    )

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== Dalibor Margotic CTE (First method already posted by Arjun S) ====='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT;

    WITH CTE AS

    (

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode = 'A'

    GROUP BY CustomerID

    )

    ,CTE2 AS

    (

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode = 'B'

    GROUP BY CustomerID

    )

    ,CTE3 AS

    (

    SELECT CustomerID

    FROM #Purchase

    WHERE ProductCode = 'C'

    GROUP BY CustomerID

    )

    select @Bitbucket = CTE.CustomerID from CTE

    left join CTE3 ON

    CTE.CustomerID = CTE3.CustomerID

    where CTE.CustomerID in (SELECT CustomerID FROM CTE2)

    AND CTE3.CustomerID IS NULL

    GROUP BY CTE.CustomerID

    ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== MAGOO ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT DISTINCT

    CustomerID

    FROM #Purchase p1

    WHERE ProductCode = 'A'

    AND EXISTS (SELECT

    1

    FROM #Purchase p2

    WHERE p2.CustomerID = p1.CustomerID

    AND p2.ProductCode = 'B')

    AND NOT EXISTS (SELECT

    1

    FROM #Purchase p3

    WHERE p3.CustomerID = p1.CustomerID

    AND p3.ProductCode = 'C')

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== Venoym ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT

    b.CustomerID

    FROM

    (

    SELECT

    a.CustomerID

    FROM

    (

    SELECT

    DISTINCT

    CustomerID,

    ProductCode

    FROM

    #Purchase

    WHERE

    ProductCode IN ('A','B')) AS a

    GROUP BY

    a.CustomerID

    HAVING

    COUNT(ProductCode) = 2

    ) AS b LEFT OUTER JOIN

    #Purchase p ON b.CustomerID = p.CustomerID AND p.ProductCode = 'C'

    WHERE

    p.ProductCode IS NULL

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== Jonathan AC Roberts ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT DISTINCT CustomerID

    FROM #Purchase p1

    WHERE ProductCode = 'A'

    AND EXISTS(SELECT 1

    FROM #Purchase p2

    WHERE p2.CustomerID = p1.CustomerID

    AND p2.ProductCode = 'B'

    AND NOT EXISTS (SELECT 1

    FROM #Purchase p3

    WHERE p3.CustomerID = p2.CustomerID

    AND p3.ProductCode = 'C'))

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== bob.probst ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    select distinct a.customerid

    from #purchase a

    inner join #purchase b on (a.customerid = b.customerid)

    left join #purchase c on (a.customerid = c.customerid and c.productcode = 'C')

    where a.productcode = 'A'

    and b.productcode = 'B'

    and c.purchaseid is null

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== Vyengr ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    Select Distinct CustomerID

    FROM #Purchase PU

    WHERE ProductCode ='A'

    AND EXISTS (Select CustomerID

    FROM #Purchase

    WHERE ProductCode ='B' and CustomerID = PU.CustomerID)

    AND NOT EXISTS (Select PU.CustomerID

    FROM #Purchase

    WHERE ProductCode ='C' and CustomerID = PU.CustomerID)

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== ejoell 66477 ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    Select Distinct ABuyers.CustomerID from

    (Select CustomerID FROM #Purchase WHERE ProductCode = 'A') ABuyers

    JOIN

    (Select CustomerID FROM #Purchase WHERE ProductCode = 'B') BBuyers

    ON ABuyers.CustomerID = BBuyers.CustomerID

    LEFT JOIN

    (Select CustomerID FROM #Purchase WHERE ProductCode = 'C') CBuyers

    ON ABuyers.CustomerID = CBuyers.CustomerID

    WHERE CBuyers.CustomerID is null

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== David Rueter ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT

    SELECT @Bitbucket = CustomerID FROM

    (

    SELECT DISTINCT p.CustomerID

    FROM

    #Purchase p

    JOIN #Purchase p2 ON

    p.CustomerID = p2.CustomerID AND

    p2.ProductCode = 'B'

    LEFT JOIN #Purchase p3 ON

    p.CustomerID = p3.CustomerID AND

    p3.ProductCode = 'C'

    WHERE

    p.ProductCode = 'A' AND

    p3.CustomerID IS NULL

    )d ;

    SET STATISTICS TIME OFF;

    GO

    PRINT '========== ColdCoffee ====================================================='

    SET STATISTICS TIME ON;

    DECLARE @Bitbucket INT;

    WITH CTE AS

    (

    SELECT P.CustomerID

    , Indicator1 = CASE WHEN P.ProductCode IN('A') THEN 0

    WHEN P.ProductCode IN('C') THEN -1

    END

    , Indicator2 = CASE WHEN P.ProductCode IN('B') THEN 1

    ELSE 0

    END

    FROM #Purchase P

    WHERE P.ProductCode IN ('A','B','C')

    )

    SELECT @Bitbucket = C.CustomerID

    FROM CTE C

    GROUP BY C.CustomerID

    HAVING SUM (C.Indicator1) = 0 AND SUM(C.Indicator2) > 0

    ;

    SET STATISTICS TIME OFF;

    GO

    Against the million row test data generator that I provided in the "Resources" section of the article near the end of the article, here are the result from the code above. Arjun is the clear winner here.

    [font="Courier New"]========== Code From Article ===========================================

    SQL Server Execution Times:

    CPU time = 265 ms, elapsed time = 329 ms.

    ========== Arjun S =====================================================

    SQL Server Execution Times:

    CPU time = 78 ms, elapsed time = 100 ms.

    ========== tommyh =====================================================

    SQL Server Execution Times:

    CPU time = 329 ms, elapsed time = 331 ms.

    ========== Toby Harman =====================================================

    SQL Server Execution Times:

    CPU time = 328 ms, elapsed time = 327 ms.

    ========== James Dingle =====================================================

    SQL Server Execution Times:

    CPU time = 281 ms, elapsed time = 298 ms.

    ========== chintan.j.gandhi =====================================================

    SQL Server Execution Times:

    CPU time = 719 ms, elapsed time = 726 ms.

    ========== Dalibor Margotic CTE (First method already posted by Arjun S) =====

    SQL Server Execution Times:

    CPU time = 109 ms, elapsed time = 95 ms.

    ========== MAGOO =====================================================

    SQL Server Execution Times:

    CPU time = 94 ms, elapsed time = 92 ms.

    ========== Venoym =====================================================

    SQL Server Execution Times:

    CPU time = 359 ms, elapsed time = 362 ms.

    ========== Jonathan AC Roberts =====================================================

    SQL Server Execution Times:

    CPU time = 94 ms, elapsed time = 89 ms.

    ========== bob.probst =====================================================

    SQL Server Execution Times:

    CPU time = 266 ms, elapsed time = 270 ms.

    ========== Vyengr =====================================================

    SQL Server Execution Times:

    CPU time = 94 ms, elapsed time = 92 ms.

    ========== ejoell 66477 =====================================================

    SQL Server Execution Times:

    CPU time = 266 ms, elapsed time = 266 ms.

    ========== David Rueter =====================================================

    SQL Server Execution Times:

    CPU time = 281 ms, elapsed time = 275 ms.

    ========== ColdCoffee =====================================================

    SQL Server Execution Times:

    CPU time = 360 ms, elapsed time = 371 ms.

    Warning: Null value is eliminated by an aggregate or other SET operation.[/font]

    I also want to say that the SQL IQ shown by folks in this discussion is quite high (well, except for mine... I lost pretty badly considering that I wrote the article :blush:). No one resorted to RBAR of any form and the difference in most of the times are less than 250 ms in difference against a million row problem. You should all take a bow. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)