Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Find Customers Who Bought "A" and "B" But Not "C" (SQL Spackle)


Find Customers Who Bought "A" and "B" But Not "C" (SQL Spackle)

Author
Message
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)

Group: General Forum Members
Points: 45002 Visits: 39880
Dean Cochrane (3/30/2012)
Berl (3/30/2012)

Although CPU time difference is not very great, the difference is much bigger for the elapsed time.
[b]


Elapsed time includes wait time and is not a good measure of performance.


Actually, it is. The wait time you speak of also includes I/O time and that can be a major part of a performance problem. I do agree, though, that you really need to take "time to display" out of the picture because it will severly mask problems. It takes a certain amount of time to display a million rows and can make differences between two methods seem much more trivial than they might be.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Berl
Berl
SSC Veteran
SSC Veteran (229 reputation)SSC Veteran (229 reputation)SSC Veteran (229 reputation)SSC Veteran (229 reputation)SSC Veteran (229 reputation)SSC Veteran (229 reputation)SSC Veteran (229 reputation)SSC Veteran (229 reputation)

Group: General Forum Members
Points: 229 Visits: 47
Thank you Jeff.
michal.lisinski
michal.lisinski
SSChasing Mays
SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)

Group: General Forum Members
Points: 636 Visits: 1090
Imho

SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode IN ('A','B')) AND (CustomerID NOT IN
(SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode = 'C')))
GROUP BY CustomerID, ProductCode
HAVING (COUNT(CustomerID) = 1)


Regards
Mike
Jonathan AC Roberts
Jonathan AC Roberts
SSC-Addicted
SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)

Group: General Forum Members
Points: 430 Visits: 1836
michal.lisinski (4/2/2012)
Imho

SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode IN ('A','B')) AND (CustomerID NOT IN
(SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode = 'C')))
GROUP BY CustomerID, ProductCode
HAVING (COUNT(CustomerID) = 1)


Regards
Mike

This also will select customers who have purchased just one of 'A' or 'B' and not purchased 'C'. the requirement was that they should have purchased both 'A' and 'B' and not purchased 'C'.
Antares686
Antares686
SSCrazy Eights
SSCrazy Eights (8.4K reputation)SSCrazy Eights (8.4K reputation)SSCrazy Eights (8.4K reputation)SSCrazy Eights (8.4K reputation)SSCrazy Eights (8.4K reputation)SSCrazy Eights (8.4K reputation)SSCrazy Eights (8.4K reputation)SSCrazy Eights (8.4K reputation)

Group: Moderators
Points: 8424 Visits: 780
Jeff Moden (3/31/2012)
Dean Cochrane (3/30/2012)
Berl (3/30/2012)

Although CPU time difference is not very great, the difference is much bigger for the elapsed time.
[b]


Elapsed time includes wait time and is not a good measure of performance.


Actually, it is. The wait time you speak of also includes I/O time and that can be a major part of a performance problem. I do agree, though, that you really need to take "time to display" out of the picture because it will severly mask problems. It takes a certain amount of time to display a million rows and can make differences between two methods seem much more trivial than they might be.

Well depends, the elapsed time can an ok performance indicator, however when you run the queries for testing back to back in the same batch without clearing the buffers you may get a false sense of performance from elapsed time and many other stats. I have a local test SQL Server which I always test by stopping the instance then starting so I can be sure everything is clear for testing. But the query plans, so I would be carefull in saying one scenario over another is truely best if you compare in a single batch. Now all that said, a huge difference in elapsed time shows a performance change and when I say huge took 30minutes to run first time then 30 seconds with second query.



michal.lisinski
michal.lisinski
SSChasing Mays
SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)

Group: General Forum Members
Points: 636 Visits: 1090
Hmm, r u sure? Customer no 6 took product A and no B, customer 7 took product B and no A, both didn't take C, but query doesn't return them.
Regards
Mike
michal.lisinski
michal.lisinski
SSChasing Mays
SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)

Group: General Forum Members
Points: 636 Visits: 1090
Jonathan AC Roberts (4/2/2012)
michal.lisinski (4/2/2012)
Imho

SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode IN ('A','B')) AND (CustomerID NOT IN
(SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode = 'C')))
GROUP BY CustomerID, ProductCode
HAVING (COUNT(CustomerID) = 1)


Regards
Mike

This also will select customers who have purchased just one of 'A' or 'B' and not purchased 'C'. the requirement was that they should have purchased both 'A' and 'B' and not purchased 'C'.

michal.lisinski
michal.lisinski
SSChasing Mays
SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)SSChasing Mays (636 reputation)

Group: General Forum Members
Points: 636 Visits: 1090
Jonathan AC Roberts (4/2/2012)
michal.lisinski (4/2/2012)
Imho

SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode IN ('A','B')) AND (CustomerID NOT IN
(SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode = 'C')))
GROUP BY CustomerID, ProductCode
HAVING (COUNT(CustomerID) = 1)


Regards
Mike

This also will select customers who have purchased just one of 'A' or 'B' and not purchased 'C'. the requirement was that they should have purchased both 'A' and 'B' and not purchased 'C'.



Hmm, r u sure? Customer no 6 took product A and no B, customer 7 took product B and no A, both didn't take C, but query doesn't return them.
Regards
Mike
Jonathan AC Roberts
Jonathan AC Roberts
SSC-Addicted
SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)SSC-Addicted (430 reputation)

Group: General Forum Members
Points: 430 Visits: 1836
michal.lisinski (4/2/2012)
Jonathan AC Roberts (4/2/2012)
michal.lisinski (4/2/2012)
Imho

SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode IN ('A','B')) AND (CustomerID NOT IN
(SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode = 'C')))
GROUP BY CustomerID, ProductCode
HAVING (COUNT(CustomerID) = 1)


Regards
Mike

This also will select customers who have purchased just one of 'A' or 'B' and not purchased 'C'. the requirement was that they should have purchased both 'A' and 'B' and not purchased 'C'.



Hmm, r u sure? Customer no 6 took product A and no B, customer 7 took product B and no A, both didn't take C, but query doesn't return them.
Regards
Mike


Hi Mike,
I tried the following on your code:
;WITH [#Purchase] AS
(
SELECT 'A' ProductCode, 1 CustomerId UNION ALL
SELECT 'B' ProductCode, 1 CustomerId UNION ALL
SELECT 'A' ProductCode, 2 CustomerId UNION ALL
SELECT 'B' ProductCode, 3 CustomerId UNION ALL
SELECT 'A' ProductCode, 4 CustomerId UNION ALL
SELECT 'A' ProductCode, 4 CustomerId UNION ALL
SELECT 'A' ProductCode, 5 CustomerId UNION ALL
SELECT 'B' ProductCode, 5 CustomerId UNION ALL
SELECT 'C' ProductCode, 5 CustomerId
)
SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode IN ('A','B')) AND (CustomerID NOT IN
(SELECT DISTINCT CustomerID
FROM [#Purchase]
WHERE (ProductCode = 'C')))
GROUP BY CustomerID, ProductCode
HAVING (COUNT(CustomerID) = 1)



Customer 1 has bought 'A' and 'B' and not 'C' and he gets found correctly
Customer 2 has bought 'A' and not 'B' and not 'C' and he gets found incorrectly
Customer 3 has bought not 'A' and 'B' and not 'C' and he gets found incorrectly
Customer 4 has bought two 'A' and not 'B' and not 'C' and he doesn't get found correctly
Customer 5 has bought 'A' and 'B' and 'C' and he doesn't get found correctly
Gullimeel
Gullimeel
SSC Veteran
SSC Veteran (235 reputation)SSC Veteran (235 reputation)SSC Veteran (235 reputation)SSC Veteran (235 reputation)SSC Veteran (235 reputation)SSC Veteran (235 reputation)SSC Veteran (235 reputation)SSC Veteran (235 reputation)

Group: General Forum Members
Points: 235 Visits: 506
select * from
(
select id,max(case when cd = 'A' then 1 else 0 end) + max(case when cd = 'B' then 2 else 0 end) + max(case when cd = 'C' then 4 else 0 end) status from a
--where cd in ('A','B','C')
group by id
) dta
where
1=1
--and status = 3 -- a,b but not c
--and status = 5--does have a,c but not b
--and status = 6--does have b,c but not a
--and status = 7 --all 3
--and status = 0 --none of them
--and status = 1 -- just A
--and status = 2 -- just b
and status = 4 -- just C


Replace Id with CustomerId and cd with ProductCode. You can generalize the code the way you want..You can use analytic function instead of group by as well.

GulliMeel

Finding top n Worst Performing queries
Improve the performance of Merge Join(special case)
How to Post Performance Problem -Gail Shaw
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search