|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 10:37 AM
Points: 32,893,
Visits: 26,769
|
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Monday, March 18, 2013 3:03 PM
Points: 40,
Visits: 303
|
|
@Jeff
Did you compare the speed of this query against one where your first (Group By) From clause is a Select Distinct (sub) query?
Nice article. It builds nicely for the reader.
I love seeing the Except clause being introduced to the reader who, like me, cut our teeth on SQL92 and need to learn newer language features to be more productive.
|
|
|
|
|
SSC Veteran
      
Group: General Forum Members
Last Login: Sunday, March 31, 2013 11:03 AM
Points: 229,
Visits: 201
|
|
Another way to get the result would be to use Intersect and then combine it with Except. The distinct part is handled implicitly.
--===== Find Customers that bought both "A" AND "B" SELECT CustomerID FROM #Purchase WHERE ProductCode IN ('A') INTERSECT SELECT CustomerID FROM #Purchase WHERE ProductCode IN ('B') EXCEPT --===== Find Customers that bought "C". SELECT CustomerID FROM #Purchase WHERE ProductCode IN ('C') ;
|
|
|
|
|
UDP Broadcaster
      
Group: General Forum Members
Last Login: Friday, May 10, 2013 3:15 AM
Points: 1,476,
Visits: 1,943
|
|
Wouldnt something like this work just as well?
SELECT CustomerId FROM #Purchase WHERE ProductCode IN ('A','B', 'C') GROUP BY CustomerID having sum(case when ProductCode = 'A' then 1 else 0 end) > 0 and sum(case when ProductCode = 'B' then 1 else 0 end) > 0 and sum(case when ProductCode = 'C' then 1 else 0 end) = 0
/T
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Wednesday, May 01, 2013 4:37 PM
Points: 2,248,
Visits: 5,352
|
|
tommyh (3/28/2012)
Wouldnt something like this work just as well? SELECT CustomerId FROM #Purchase WHERE ProductCode IN ('A','B', 'C') GROUP BY CustomerID having sum(case when ProductCode = 'A' then 1 else 0 end) > 0 and sum(case when ProductCode = 'B' then 1 else 0 end) > 0 and sum(case when ProductCode = 'C' then 1 else 0 end) = 0
/T Yep, this is how i once implemented it! Cant remember the thread now.
Here it is: http://www.sqlservercentral.com/Forums/FindPost1267224.aspx
Proved to elimate the EXCEPT part; also does a single scan on the table.
|
|
|
|
|
SSCrazy Eights
        
Group: General Forum Members
Last Login: 2 days ago @ 6:54 AM
Points: 9,364,
Visits: 6,462
|
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Wednesday, May 01, 2013 4:37 PM
Points: 2,248,
Visits: 5,352
|
|
| OH yeah, Nice one Jeff. As always! Thanks a lot for taking time to teach us some of the tricks of the game. Thaks a lot :)
|
|
|
|
|
Old Hand
      
Group: General Forum Members
Last Login: Wednesday, May 15, 2013 3:21 AM
Points: 322,
Visits: 475
|
|
Had a play with this and was very impressed - nice job.
I played around and added this index
CREATE INDEX IX_#Purchase_ProductCode ON #Purchase ( ProductCode) INCLUDE (CustomerID)
One concern I had was looking at the query plan, the EXCEPT does a clustered index scan. That may be a function of the small data set, but I am not sure.
My preference for these types of queries has always been to do an OUTER JOIN on the one we don't want selected and then say that we only want the ones that didn't match returned.
SELECT a.CustomerID FROM #Purchase a LEFT OUTER JOIN #Purchase b ON b.CustomerID = a.CustomerID AND b.ProductCode IN ('C') WHERE a.ProductCode IN ('A','B') AND b.CustomerID IS NULL GROUP BY a.CustomerID HAVING COUNT(DISTINCT a.ProductCode) = 2
Any thoughts?
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Thursday, April 26, 2012 5:28 PM
Points: 14,
Visits: 72
|
|
"Left Join Where Is Null" gets complicated to read/understand and is subject to cardinalities issues that will impact performance if the right set has many records for 1 record in the left hand side.
There is also this way, probably the closest to natural language.
Select Distinct CustomerID From #Purchase Where ProductCode = 'A' And CustomerID In ( Select CustomerID From #Purchase Where ProductCode = 'B' ) And CustomerID Not In ( Select CustomerID From #Purchase Where ProductCode = 'C' )
I would be interested to have this table filled by 100,000+ rows of sample data and then compare execution plans and statistics of the proposed approaches. Comparing syntax without an idea of the performance signature is a bit useless IMHO.
So here is another version of the initialization script from which the difference will be more blatant:
--===== Conditionally drop the test table to make -- reruns in SSMS easier. IF OBJECT_ID('tempdb..#Purchase','U') IS NOT NULL DROP TABLE #Purchase ; --===== Create the test table CREATE TABLE #Purchase ( PurchaseID INT IDENTITY(1,1), CustomerID INT, ProductCode VARCHAR(1000) PRIMARY KEY CLUSTERED (PurchaseID) ) ;
--===== Populate the test table with known data. INSERT INTO #Purchase (CustomerID, ProductCode) ------- Customer #1 precisely meets the criteria. -- Bought 'A' and 'B' but not 'C'. SELECT 1, 'A' UNION ALL SELECT 1, 'B' UNION ALL ------- Customer #2 also meets the criteria. -- Bought 'A' and 'B' and somthing else, -- but not 'C'. SELECT 2, 'A' UNION ALL SELECT 2, 'B' UNION ALL SELECT 2, 'D' UNION ALL ------- Customer #3 also meets the criteria. -- Bought 'A' and 'B' and something else, -- but not 'C'. SELECT 3, 'A' UNION ALL SELECT 3, 'B' UNION ALL SELECT 3, 'D' UNION ALL SELECT 3, 'A' UNION ALL SELECT 3, 'D' UNION ALL ------- Customer #4 doesn't meet the criteria. -- Bought 'A' and 'B' but also bought 'C'. SELECT 4, 'A' UNION ALL SELECT 4, 'B' UNION ALL SELECT 4, 'C' UNION ALL ------- Customer #5 doesn't meet the criteria. -- Bought 'A' and 'B' and something else, -- but also bought 'C'. SELECT 5, 'A' UNION ALL SELECT 5, 'B' UNION ALL SELECT 5, 'A' UNION ALL SELECT 5, 'B' UNION ALL SELECT 5, 'C' UNION ALL SELECT 5, 'D' UNION ALL ------- Customer #6 doesn't meet the criteria. -- Bought more than 1 of 'A' and something else -- but not 'B'. SELECT 6, 'A' UNION ALL SELECT 6, 'A' UNION ALL SELECT 6, 'D' UNION ALL SELECT 6, 'E' UNION ALL ------- Customer #7 doesn't meet the criteria. -- Bought more than 1 of 'B' and something else -- but not 'A'. SELECT 7, 'B' UNION ALL SELECT 7, 'B' UNION ALL SELECT 7, 'D' UNION ALL SELECT 7, 'E'
go
declare @i int = 10; while @i < 100000 begin insert into #Purchase (CustomerID, ProductCode) select CustomerID + @i, ProductCode + Left(ProductCode, 1) From #Purchase set @i = @i * 2 end Then turn on statistics or open your SQL profiler
SET STATISTICS IO ON And now you're ready.
Then you will also need an index on ProductCode.
Create Index IX_ProductCode On #Purchase (ProductCode) Include (CustomerID);
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Friday, January 25, 2013 3:25 AM
Points: 22,
Visits: 51
|
|
How about this? Although i agree query might get longer when there would be more than 3 items
SELECT DISTINCT CustomerID FROM #Purchase P WHERE EXISTS (SELECT 1 FROM #Purchase P1 WHERE P.customerid = P1.customerid AND P1.productcode = 'A' ) AND EXISTS (SELECT 1 FROM #Purchase P2 WHERE P.customerid = P2.customerid AND P2.productcode = 'B' ) AND NOT EXISTS (SELECT 1 FROM #Purchase P3 WHERE P.customerid = P3.customerid AND P3.productcode = 'C' )
|
|
|
|