|
|
|
SSC Veteran
      
Group: General Forum Members
Last Login: Thursday, January 17, 2013 1:46 AM
Points: 231,
Visits: 505
|
|
select * from ( select id,max(case when cd = 'A' then 1 else 0 end) + max(case when cd = 'B' then 2 else 0 end) + max(case when cd = 'C' then 4 else 0 end) status from a --where cd in ('A','B','C') group by id ) dta where 1=1 --and status = 3 -- a,b but not c --and status = 5--does have a,c but not b --and status = 6--does have b,c but not a --and status = 7 --all 3 --and status = 0 --none of them --and status = 1 -- just A --and status = 2 -- just b and status = 4 -- just C
Change the id with CustomerId ,cd with ProductCode. You are done.You can use anaytical function using parttion instead of group if you need all the columns from the table. You can use the same concept and could get the results for any combination of product codes.
GulliMeel
Finding top n Worst Performing queries Improve the performance of Merge Join(special case) How to Post Performance Problem -Gail Shaw
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Yesterday @ 5:33 PM
Points: 32,902,
Visits: 26,783
|
|
Gullimeel (5/11/2012) select * from ( select id,max(case when cd = 'A' then 1 else 0 end) + max(case when cd = 'B' then 2 else 0 end) + max(case when cd = 'C' then 4 else 0 end) status from a --where cd in ('A','B','C') group by id ) dta where 1=1 --and status = 3 -- a,b but not c --and status = 5--does have a,c but not b --and status = 6--does have b,c but not a --and status = 7 --all 3 --and status = 0 --none of them --and status = 1 -- just A --and status = 2 -- just b and status = 4 -- just C
Change the id with CustomerId ,cd with ProductCode. You are done.You can use anaytical function using parttion instead of group if you need all the columns from the table. You can use the same concept and could get the results for any combination of product codes.
Ummm... that's real nice but what builds the status column? Some magical analytical function that you haven't shown us?
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
SSC Veteran
      
Group: General Forum Members
Last Login: Thursday, January 17, 2013 1:46 AM
Points: 231,
Visits: 505
|
|
|
|
|
|
SSC Veteran
      
Group: General Forum Members
Last Login: Thursday, January 17, 2013 1:46 AM
Points: 231,
Visits: 505
|
|
Below is the sql to show how much cpu time and logical IO's it takes to find the customers who bought A,B but not C.
Also, I have done a performance comparison when you want to find the details from Purchase table for the customer who bought A, B but not C.
The method using parttion by which eliminates join with purchase column is way too expensive...
Thus before using any new method benchmark it against existing best solution...
Just to make sure which method is better..Follow one rule..
Benchmark benchmark Benchmark.....
use tempdb go /* --uncomment this to create purchase table.. This script to generate the 1 million rows is --created by Jeff Moden on sql server central . com
--===== This code takes about 23 seconds to run on a 10 year old, -- single CPU desktop machine.
--===== Conditionally drop the test table to make reruns in SSMS easier. -- This is NOT a part of the solution. IF OBJECT_ID('tempdb..Purchase','U') IS NOT NULL DROP TABLE Purchase ; --===== Create and populate the test table. -- This is NOT a part of the solution. SELECT TOP (1000000) PurchaseID = IDENTITY(INT,1,1), CustomerID = ABS(CHECKSUM(NEWID())) % 50000 + 1, ProductCode = CHAR(ABS(CHECKSUM(NEWID())) % 26 + 65) INTO Purchase FROM sys.all_columns ac1 CROSS JOIN sys.all_columns ac2 ; --===== Add the expected PK ALTER TABLE Purchase ADD PRIMARY KEY CLUSTERED (PurchaseID) ; --===== Can you guess which index will be used? CREATE INDEX IX_Purchase_CustomerID_ProductCode ON Purchase (CustomerID, ProductCode) ; CREATE INDEX IX_Purchase_ProductCode_CustomerID ON Purchase (ProductCode, CustomerID) update statistics Purchase with fullscan */ --- find only the customer ids who bought a,b but not c with cte1 as ( select CustomerId ,MAX(case when ProductCode = 'A' then 1 else 0 end) + max(case when ProductCode = 'B' then 2 else 0 end) + max(case when ProductCode = 'C' then 4 else 0 end ) as status from Purchase group by CustomerID ) select * from cte1 where status = 3
--Above uses a non clustered index scan and clocked 1867 logical IO and around 800 ms cpu and elapsed time.
--query to get the details of customers along with product code and using join back to the original table.. with cte1 as ( select CustomerId ,MAX(case when ProductCode = 'A' then 1 else 0 end) + max(case when ProductCode = 'B' then 2 else 0 end) + max(case when ProductCode = 'C' then 4 else 0 end ) as status from Purchase group by CustomerID ) select p.CustomerId,p.ProductCode,cte1.status from cte1 inner join purchase p on cte1.CustomerId = p.CustomerId where status = 3 and p.ProductCode not in ('C'); --as we
--It clocked 3734 Io's and 1200 ms CPU time and 1500 ms of elapsed time for 137 K rows..This is quite fast
--This is using without join but using parttion by ..This is performaing badly.. --Thus do not try the new methods just to make the statement simple.. --Always benchmark with the best method you have with any new method which is introduced in the new versions.. --It is clocking 2.2 million IO's approx...Which is way too high 7533 ms CPU time... and 2853 ms elapsed time for 137 k rows with cte1 as ( select CustomerID,ProductCode , case when ProductCode = 'A' then 1 else 0 end as IsAExist , case when ProductCode = 'B' then 2 else 0 end as IsBExist , case when ProductCode = 'C' then 4 else 0 end as IsCExist from Purchase ),cte2 as ( select CustomerID,ProductCode, MAX(IsAExist) over(PARTITION by CustomerID) + MAX(IsBExist)over(PARTITION by CustomerID) + MAX(IsCExist)over(PARTITION by CustomerID) as status from cte1 ) select * from cte2 where status = 3 --option(maxdop 1) ; --a and b but not C go
--
/* To compare the above... without using parttition
Logical IO CPU Time Elapsed Time 3734 800ms 800ms 2201871 7500ms 2850ms */
GulliMeel
Finding top n Worst Performing queries Improve the performance of Merge Join(special case) How to Post Performance Problem -Gail Shaw
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: Yesterday @ 2:02 PM
Points: 6,367,
Visits: 8,228
|
|
|
|
|
|
SSC-Addicted
      
Group: General Forum Members
Last Login: Today @ 4:06 AM
Points: 431,
Visits: 184
|
|
Another way to skin the cat? (results are for the million-row table provided in the article)
(apologies if someone already posted this method)
--SET STATISTICS IO ON --SET STATISTICS TIME ON
SELECT CustomerID FROM #Purchase WHERE ProductCode IN('A','B','C') -- only A,B,C, AB,AC,BC and ABC GROUP BY CustomerID HAVING COUNT(DISTINCT ProductCode) = 2 -- only AB,AC and BC AND MAX(ProductCode) = 'B' -- only AB! :)
/* (6763 row(s) affected) Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table '#Purchase'. Scan count 3, logical reads 226, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 187 ms, elapsed time = 187 ms. */
|
|
|
|