Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Find Customers Who Bought "A" and "B" But Not "C" (SQL Spackle)


Find Customers Who Bought "A" and "B" But Not "C" (SQL Spackle)

Author
Message
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)

Group: General Forum Members
Points: 45241 Visits: 39927
Comments posted to this topic are about the item Find Customers Who Bought "A" and "B" But Not "C" (SQL Spackle)

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
mark hutchinson
mark hutchinson
Valued Member
Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)

Group: General Forum Members
Points: 58 Visits: 446
@Jeff

Did you compare the speed of this query against one where your first (Group By) From clause is a Select Distinct (sub) query?

Nice article. It builds nicely for the reader.

I love seeing the Except clause being introduced to the reader who, like me, cut our teeth on SQL92 and need to learn newer language features to be more productive.



Arjun S
Arjun S
SSC Veteran
SSC Veteran (231 reputation)SSC Veteran (231 reputation)SSC Veteran (231 reputation)SSC Veteran (231 reputation)SSC Veteran (231 reputation)SSC Veteran (231 reputation)SSC Veteran (231 reputation)SSC Veteran (231 reputation)

Group: General Forum Members
Points: 231 Visits: 231
Another way to get the result would be to use Intersect and then combine it with Except. The distinct part is handled implicitly.


--===== Find Customers that bought both "A" AND "B"
SELECT CustomerID
FROM #Purchase
WHERE ProductCode IN ('A')
INTERSECT
SELECT CustomerID
FROM #Purchase
WHERE ProductCode IN ('B')
EXCEPT
--===== Find Customers that bought "C".
SELECT CustomerID
FROM #Purchase
WHERE ProductCode IN ('C')
;


tommyh
tommyh
SSCommitted
SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)SSCommitted (1.6K reputation)

Group: General Forum Members
Points: 1560 Visits: 2000
Wouldnt something like this work just as well?

SELECT
CustomerId
FROM #Purchase
WHERE ProductCode IN ('A','B', 'C')
GROUP BY CustomerID
having sum(case when ProductCode = 'A' then 1 else 0 end) > 0
and sum(case when ProductCode = 'B' then 1 else 0 end) > 0
and sum(case when ProductCode = 'C' then 1 else 0 end) = 0



/T
ColdCoffee
ColdCoffee
SSCrazy
SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)

Group: General Forum Members
Points: 2317 Visits: 5545
tommyh (3/28/2012)
Wouldnt something like this work just as well?

SELECT
CustomerId
FROM #Purchase
WHERE ProductCode IN ('A','B', 'C')
GROUP BY CustomerID
having sum(case when ProductCode = 'A' then 1 else 0 end) > 0
and sum(case when ProductCode = 'B' then 1 else 0 end) > 0
and sum(case when ProductCode = 'C' then 1 else 0 end) = 0



/T

Yep, this is how i once implemented it! Cant remember the thread now.

Here it is: http://www.sqlservercentral.com/Forums/FindPost1267224.aspx

Proved to elimate the EXCEPT part; also does a single scan on the table.
Koen Verbeeck
Koen Verbeeck
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16471 Visits: 13207
Nice spackle Jeff, and great alternatives in the discussion!



How to post forum questions.
Need an answer? No, you need a question.
What’s the deal with Excel & SSIS?

Member of LinkedIn. My blog at SQLKover.

MCSA SQL Server 2012 - MCSE Business Intelligence
ColdCoffee
ColdCoffee
SSCrazy
SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)SSCrazy (2.3K reputation)

Group: General Forum Members
Points: 2317 Visits: 5545
OH yeah, Nice one Jeff. As always! Thanks a lot for taking time to teach us some of the tricks of the game. Thaks a lot Smile
Toby Harman
Toby Harman
SSC-Addicted
SSC-Addicted (485 reputation)SSC-Addicted (485 reputation)SSC-Addicted (485 reputation)SSC-Addicted (485 reputation)SSC-Addicted (485 reputation)SSC-Addicted (485 reputation)SSC-Addicted (485 reputation)SSC-Addicted (485 reputation)

Group: General Forum Members
Points: 485 Visits: 668
Had a play with this and was very impressed - nice job.

I played around and added this index

CREATE INDEX IX_#Purchase_ProductCode
ON #Purchase (
ProductCode)
INCLUDE (CustomerID)



One concern I had was looking at the query plan, the EXCEPT does a clustered index scan. That may be a function of the small data set, but I am not sure.

My preference for these types of queries has always been to do an OUTER JOIN on the one we don't want selected and then say that we only want the ones that didn't match returned.


SELECT a.CustomerID
FROM #Purchase a
LEFT OUTER JOIN #Purchase b
ON b.CustomerID = a.CustomerID
AND b.ProductCode IN ('C')
WHERE a.ProductCode IN ('A','B')
AND b.CustomerID IS NULL
GROUP BY a.CustomerID
HAVING COUNT(DISTINCT a.ProductCode) = 2



Any thoughts?
James Dingle-651585
James Dingle-651585
Grasshopper
Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)

Group: General Forum Members
Points: 16 Visits: 73
"Left Join Where Is Null" gets complicated to read/understand and is subject to cardinalities issues that will impact performance if the right set has many records for 1 record in the left hand side.

There is also this way, probably the closest to natural language.

Select   Distinct
CustomerID
From #Purchase
Where ProductCode = 'A'
And CustomerID In
(
Select CustomerID
From #Purchase
Where ProductCode = 'B'
)
And CustomerID Not In
(
Select CustomerID
From #Purchase
Where ProductCode = 'C'
)





I would be interested to have this table filled by 100,000+ rows of sample data and then compare execution plans and statistics of the proposed approaches.
Comparing syntax without an idea of the performance signature is a bit useless IMHO.

So here is another version of the initialization script from which the difference will be more blatant:
--===== Conditionally drop the test table to make
-- reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Purchase','U') IS NOT NULL
DROP TABLE #Purchase
;
--===== Create the test table
CREATE TABLE #Purchase
(
PurchaseID INT IDENTITY(1,1),
CustomerID INT,
ProductCode VARCHAR(1000)
PRIMARY KEY CLUSTERED (PurchaseID)
)
;

--===== Populate the test table with known data.
INSERT INTO #Purchase
(CustomerID, ProductCode)
------- Customer #1 precisely meets the criteria.
-- Bought 'A' and 'B' but not 'C'.
SELECT 1, 'A' UNION ALL
SELECT 1, 'B' UNION ALL
------- Customer #2 also meets the criteria.
-- Bought 'A' and 'B' and somthing else,
-- but not 'C'.
SELECT 2, 'A' UNION ALL
SELECT 2, 'B' UNION ALL
SELECT 2, 'D' UNION ALL
------- Customer #3 also meets the criteria.
-- Bought 'A' and 'B' and something else,
-- but not 'C'.
SELECT 3, 'A' UNION ALL
SELECT 3, 'B' UNION ALL
SELECT 3, 'D' UNION ALL
SELECT 3, 'A' UNION ALL
SELECT 3, 'D' UNION ALL
------- Customer #4 doesn't meet the criteria.
-- Bought 'A' and 'B' but also bought 'C'.
SELECT 4, 'A' UNION ALL
SELECT 4, 'B' UNION ALL
SELECT 4, 'C' UNION ALL
------- Customer #5 doesn't meet the criteria.
-- Bought 'A' and 'B' and something else,
-- but also bought 'C'.
SELECT 5, 'A' UNION ALL
SELECT 5, 'B' UNION ALL
SELECT 5, 'A' UNION ALL
SELECT 5, 'B' UNION ALL
SELECT 5, 'C' UNION ALL
SELECT 5, 'D' UNION ALL
------- Customer #6 doesn't meet the criteria.
-- Bought more than 1 of 'A' and something else
-- but not 'B'.
SELECT 6, 'A' UNION ALL
SELECT 6, 'A' UNION ALL
SELECT 6, 'D' UNION ALL
SELECT 6, 'E' UNION ALL
------- Customer #7 doesn't meet the criteria.
-- Bought more than 1 of 'B' and something else
-- but not 'A'.
SELECT 7, 'B' UNION ALL
SELECT 7, 'B' UNION ALL
SELECT 7, 'D' UNION ALL
SELECT 7, 'E'

go

declare @i int = 10;
while @i < 100000
begin
insert into #Purchase (CustomerID, ProductCode)
select CustomerID + @i, ProductCode + Left(ProductCode, 1)
From #Purchase
set @i = @i * 2
end



Then turn on statistics or open your SQL profiler
SET STATISTICS IO ON



And now you're ready.

Then you will also need an index on ProductCode.
Create Index IX_ProductCode On #Purchase (ProductCode) Include (CustomerID);


chintan.j.gandhi
chintan.j.gandhi
SSC Rookie
SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)SSC Rookie (29 reputation)

Group: General Forum Members
Points: 29 Visits: 52
How about this? Although i agree query might get longer when there would be more than 3 items


SELECT DISTINCT CustomerID
FROM #Purchase P
WHERE EXISTS (SELECT 1 FROM #Purchase P1
WHERE P.customerid = P1.customerid
AND P1.productcode = 'A'
)
AND EXISTS (SELECT 1 FROM #Purchase P2
WHERE P.customerid = P2.customerid
AND P2.productcode = 'B'
)
AND NOT EXISTS (SELECT 1 FROM #Purchase P3
WHERE P.customerid = P3.customerid
AND P3.productcode = 'C'
)


Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search