Forum Newbie
      
Problem: A bank needs to find the number of consecutive months in which a customer spent more than a certain amount, or a sales department needs to find the number of consecutive months in which a product sold more than a certain amount.
If the data covers only a few months, or only 2 or 3 consecutive months are required, a simple table join can do it. But with years of records and a need to count all consecutive months, hard-coding a table join for 2, 3, 4, 5 ... consecutive months becomes impractical. A recursive CTE offers an easy way to solve such a problem.
For example, suppose a table stores sales records like the ones below, and we need to list the number of consecutive months for any salesperson who sold more than 2 of a product in a month. It can be done easily with the following recursive CTE query.
Sample data:

CREATE TABLE #sales(
    [name] [varchar](50) NOT NULL,
    [saledate] [datetime] NULL,
    [quantity] [int] NULL
)

insert into #sales(name,saledate,quantity) values
('A','2012-01-01',1), ('A','2012-02-01',2), ('A','2012-03-01',3),
('A','2012-04-01',4), ('A','2012-05-01',5), ('A','2012-06-01',6)

insert into #sales(name,saledate,quantity) values
('B','2012-01-01',6), ('B','2012-02-01',2), ('B','2012-03-01',3),
('B','2012-04-01',4), ('B','2012-05-01',1), ('B','2012-06-01',6)

insert into #sales(name,saledate,quantity) values
('C','2012-01-01',6), ('C','2012-02-01',1), ('C','2012-03-01',3),
('C','2012-04-01',1), ('C','2012-05-01',4), ('C','2012-06-01',1)

insert into #sales(name,saledate,quantity) values
('D','2012-01-01',6), ('D','2012-02-01',3), ('D','2012-03-01',3),
('D','2012-04-01',4), ('D','2012-05-01',1), ('D','2012-06-01',6)

Query:

-- wm: all months in which more than 2 of the product were sold
-- (the leading semicolon terminates the preceding statement, which WITH requires)
;with wm as (
    select name, saledate
    from #sales
    where quantity > 2
),
-- recurse over the qualified rows only, not over all rows
base_cte (name, saledate) as (
    select * from wm
    union all
    select a.name, a.saledate
    from wm a
    inner join base_cte b
        on a.name = b.name
       and a.saledate = dateadd(month, 1, b.saledate)
)
-- the cnt column gives the number of consecutive qualifying months ending at that month
select b.name, b.saledate, COUNT(b.name) as cnt
from base_cte b
group by b.name, b.saledate
order by b.name, b.saledate

-- For example, cnt = 2 means that month is the second of 2 consecutive months,
-- counting backward, in each of which more than 2 of the product were sold.
-- cnt = 3 means that month is the third of 3 consecutive months,
-- counting backward, in each of which more than 2 of the product were sold.
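One way to read that output: cnt is the length of the run of qualifying months ending at that row, so the longest run per name is the largest cnt for that name. The following is a minimal sketch under that reading, assuming the #sales sample table above exists in the session. The CTE name runs and the alias longest_run are illustrative and not from the original post. SQL Server stops a recursive CTE after 100 levels by default, so the MAXRECURSION hint is added on the assumption that real data may contain runs longer than the sample's.

```sql
-- Sketch only: longest run of consecutive qualifying months per name.
-- Assumes #sales from the sample data above.
;with wm as (
    select name, saledate
    from #sales
    where quantity > 2
),
base_cte (name, saledate) as (
    select name, saledate from wm
    union all
    select a.name, a.saledate
    from wm a
    inner join base_cte b
        on a.name = b.name
       and a.saledate = dateadd(month, 1, b.saledate)
),
-- runs: one row per qualifying month with the run length ending there
runs as (
    select name, saledate, count(*) as cnt
    from base_cte
    group by name, saledate
)
select name, max(cnt) as longest_run
from runs
group by name
order by name
option (maxrecursion 0);  -- 0 lifts the default limit of 100 recursion levels
```

With the sample data this should return 4 for A, 2 for B, 1 for C and 4 for D. Note that a run of n months produces n(n+1)/2 rows in base_cte before grouping, so long runs make the recursion increasingly expensive.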
SSC Veteran
      
Using Jeff Moden's article http://www.sqlservercentral.com/articles/T-SQL/71550/ about Group Islands of Contiguous Dates as inspiration, you could do the following:

;WITH cte AS (
    SELECT name,
           DATEADD(mm, - ROW_NUMBER() OVER (ORDER BY name, saledate), saledate) dategroup,
           saledate
    FROM #sales
    WHERE quantity > 2
)
SELECT name, MIN(saledate) firstsaledate, COUNT(*) cnt
FROM cte
GROUP BY name, dategroup
ORDER BY name, dategroup
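To see why the grouping works, it can help to look at the intermediate values. Within a run of consecutive months, the date and the row number both advance by one per row, so subtracting the row number (in months) from the date yields the same value for every row of the run. The sketch below only exposes those intermediate columns. It assumes the #sales sample table from the first post, and it numbers rows per name with PARTITION BY, which is a small variation on the query above (that query numbers all rows in one sequence and relies on grouping by name). The alias rn is illustrative.

```sql
-- Sketch only: show the row number and the derived group key side by side.
-- Assumes #sales from the first post.
SELECT name,
       saledate,
       ROW_NUMBER() OVER (PARTITION BY name ORDER BY saledate) AS rn,
       DATEADD(mm,
               -ROW_NUMBER() OVER (PARTITION BY name ORDER BY saledate),
               saledate) AS dategroup
FROM #sales
WHERE quantity > 2
ORDER BY name, saledate;
```

For B, whose qualifying months are January, March, April and June 2012, dategroup should come out as 2011-12-01, 2012-01-01, 2012-01-01 and 2012-02-01, so only March and April share a group.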
SSCommitted
      
The sales department needs to find the number of consecutive months in which a product sold more than a certain amount.
If the data covers only a few months, or only 2 or 3 consecutive months are required, a simple table join can do it.
But with years of rows and a need to count all consecutive months, hard-coding a table join for 2, 3, 4, 5 ... consecutive months becomes impractical. A recursive CTE offers an easy way to solve such a problem.
For example, suppose a table stores Sales rows like the ones below, and we need to list the consecutive months for any salesman who sold more than 2 of a product in a month. It can be done easily with the following CTE query:
CREATE TABLE Sales
(salesman_name CHAR(10) NOT NULL,
 sale_date CHAR(10) NOT NULL,
 PRIMARY KEY (salesman_name, sale_date),
 sale_qty INTEGER NOT NULL
   CHECK (sale_qty > 0));

INSERT INTO Sales
VALUES
('A', '2012-01-00', 1), ('A', '2012-02-00', 2), ('A', '2012-03-00', 3),
('A', '2012-04-00', 4), ('A', '2012-05-00', 5), ('A', '2012-06-00', 6),
('B', '2012-01-00', 6), ('B', '2012-02-00', 2), ('B', '2012-03-00', 3),
('B', '2012-04-00', 4), ('B', '2012-05-00', 1), ('B', '2012-06-00', 6),
('C', '2012-01-00', 6), ('C', '2012-02-00', 1), ('C', '2012-03-00', 3),
('C', '2012-04-00', 1), ('C', '2012-05-00', 4), ('C', '2012-06-00', 1),
('D', '2012-01-00', 6), ('D', '2012-02-00', 3), ('D', '2012-03-00', 3),
('D', '2012-04-00', 4), ('D', '2012-05-00', 1), ('D', '2012-06-00', 6);

WITH X1
AS
(SELECT salesman_name, sale_date, sale_qty,
        ROW_NUMBER()
        OVER (PARTITION BY salesman_name
              ORDER BY salesman_name, sale_date) AS r1
   FROM Sales),
X2
AS
(SELECT salesman_name, sale_date, sale_qty,
        r1 - ROW_NUMBER()
        OVER (PARTITION BY salesman_name
              ORDER BY salesman_name, sale_date) AS sale_grp
   FROM X1
  WHERE sale_qty > 2)

SELECT salesman_name, MIN(sale_date), MAX(sale_date)
  FROM X2
 GROUP BY salesman_name, sale_grp;
Books in Celko Series for Morgan-Kaufmann Publishing:
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Forum Newbie
      
I know Jeff Moden's solution. But I assume this way could be faster because there is no sorting for the row numbers. At the least, it is a different solution.
Forum Newbie
      
There is a particular problem: let's say we just want to find out who sold more than 2 each month in at least two consecutive months, and when.
For the above sample, the recursion below would work. Also, the recursion stops immediately when it reaches the first qualifying date.

with m2_cte_f (name, saledate, quantity, ind) as (
    -- anchor: start every name at the first month of the sample
    select s.*, 0 as ind
    from #sales s
    where s.saledate = '2012-01-01'
    union all
    -- step to the next month; ind = 1 once this month and the previous one both exceed 2
    select s.*,
           case when s.quantity > 2 and sc.quantity > 2 then 1 else 0 end as ind
    from #sales s
    inner join m2_cte_f sc
        on (s.saledate = dateadd(month, 1, sc.saledate) and s.name = sc.name)
    where sc.ind = 0   -- stop recursing for a name once a qualifying pair is found
)
select *
from m2_cte_f
where ind = 1
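For comparison, the same "first month that completes two consecutive qualifying months" question can also be sketched without recursion by looking at the previous row with LAG. This is a different technique from the recursion above, not a rewrite of it, and it assumes SQL Server 2012 or later (where LAG is available) and the #sales sample table from the first post. The names prev, prev_quantity, prev_saledate and first_qualifying_month are illustrative.

```sql
-- Sketch only: non-recursive alternative using LAG (SQL Server 2012+).
-- Assumes #sales from the first post.
;WITH prev AS (
    SELECT name,
           saledate,
           quantity,
           LAG(quantity) OVER (PARTITION BY name ORDER BY saledate) AS prev_quantity,
           LAG(saledate) OVER (PARTITION BY name ORDER BY saledate) AS prev_saledate
    FROM #sales
)
SELECT name, MIN(saledate) AS first_qualifying_month
FROM prev
WHERE quantity > 2
  AND prev_quantity > 2
  AND prev_saledate = DATEADD(month, -1, saledate)  -- previous row must be the previous month
GROUP BY name
ORDER BY name;
```

With the sample data this should return April 2012 for A and B and February 2012 for D, the same three months the recursion reports. Unlike the recursion, it does not depend on every name having a row for 2012-01-01.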
SSC Veteran
      
So, comparing the islands and recursive queries returning similar rows:

;WITH cte AS (
    SELECT name,
           DATEADD(mm, - ROW_NUMBER() OVER (ORDER BY name, saledate), saledate) dategroup,
           saledate
    FROM #sales
    WHERE quantity > 2 and saledate >= '2012-01-01'
)
SELECT name, max(saledate), COUNT(*)
FROM cte
GROUP BY name, dategroup
HAVING COUNT(*) > 1
ORDER BY name, dategroup

;with m2_cte_f (name, saledate, quantity, ind) as (
    select s.*, 0 as ind
    from #sales s
    where s.saledate = '2012-01-01'
    union all
    select s.*,
           case when s.quantity > 2 and sc.quantity > 2 then 1 else 0 end as ind
    from #sales s
    inner join m2_cte_f sc
        on (s.saledate = dateadd(month, 1, sc.saledate) and s.name = sc.name)
    where sc.ind = 0
)
select *
from m2_cte_f
where ind = 1

I get the following IO stats for the small test set (timings not worth mentioning, 1 ms each):

(3 row(s) affected)
Table '#sales____00000000009F'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(3 row(s) affected)
Table 'Worktable'. Scan count 2, logical reads 91, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#sales____00000000009F'. Scan count 2, logical reads 14, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Upping the stakes a tiny bit by putting a moderate amount of data (3000-odd rows) into the table:

INSERT INTO #sales (name, saledate)
SELECT *
FROM (SELECT *
      FROM (VALUES ('A'),('B'),('C'),('D'),('E'),('F'),('G'),('H'),('I'),('J'),
                   ('K'),('L'),('M'),('N'),('O'),('P'),('Q'),('R'),('S'),('T')) as sales(name)) names,
     (SELECT TOP 156 dateadd(mm, N, '1999-12-01') saledate FROM Tally) as months

UPDATE #sales SET quantity = RAND(Checksum(Newid())) * 5

CREATE CLUSTERED INDEX SALES_IDX1 ON #sales (saledate, name)
I get the following:

(22 row(s) affected)
Table '#sales__0000000000A4'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms.

(15 row(s) affected)
Table 'Worktable'. Scan count 2, logical reads 635, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#sales__0000000000A4'. Scan count 96, logical reads 193, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 3 ms.
As I added more rows to the table, the recursive query very gradually got slower and did more reads, while the islands query remained static. I got up to 75816 rows. Would you believe they had sales data back to 1770 for the same 26 people?

(27 row(s) affected)
Table '#sales__0000000000AD'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms.

(19 row(s) affected)
Table 'Worktable'. Scan count 2, logical reads 1193, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#sales__0000000000AD'. Scan count 189, logical reads 380, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 5 ms.
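For anyone wanting to repeat the comparison, figures like the ones above come from the session-level statistics settings. The following is a minimal sketch of a test harness, not the exact script used for these numbers. It assumes the test insert reads from a table named Tally with an integer column N starting at 1. The dbo schema, the row count of 11000 and the use of sys.all_columns as a row source are assumptions made for illustration.

```sql
-- Sketch only: build a Tally table if one does not exist (assumed shape: one INT column N, 1..11000).
IF OBJECT_ID('dbo.Tally') IS NULL
BEGIN
    SELECT TOP (11000) IDENTITY(INT, 1, 1) AS N
    INTO dbo.Tally
    FROM sys.all_columns ac1
    CROSS JOIN sys.all_columns ac2;
END;

-- Report logical/physical reads and CPU/elapsed time for each statement that follows.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- ... run the islands query and the recursive query here ...

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The read counts and timings appear on the Messages tab in Management Studio, one group per statement, in the same layout as the output quoted above.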
SSC-Dedicated
           
bj_shenglong (12/5/2012)
There is a particular problem: let's say we just want to find out who sold more than 2 each month in at least two consecutive months, and when.
For the above sample, the recursion below would work. Also, the recursion stops immediately when it reaches the first qualifying date.
Although they can be fast, recursive CTEs are still procedural in nature. The only way to know for sure is to do a test.
{Edit} Was distracted by a code promotion going on at work and I see that MickyT made just such a test. Thank you, good Sir!
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
SSC-Enthusiastic
      
This is my contribution:

USE tempdb
GO
IF OBJECT_ID('TestTbl') IS NOT NULL
    DROP TABLE TestTbl

CREATE TABLE Testtbl (id INT PRIMARY KEY)

INSERT INTO Testtbl ( id )
SELECT TOP 1000 ROW_NUMBER() OVER ( ORDER BY c.object_id) id
FROM sys.[columns] c, sys.[columns] c2

DELETE FROM TestTbl
WHERE id IN (SELECT top 100 ABS(CHECKSUM(NEWID())%1000) FROM sys.[columns] c)

DELETE FROM TestTbl
WHERE id IN (SELECT top 100 ABS(CHECKSUM(NEWID())%1000) FROM sys.[columns] c)

SELECT * FROM TestTbl;

WITH S AS (
    SELECT ROW_NUMBER() OVER (order by t.id) AS RN, t.id
    FROM TestTbl t
    LEFT OUTER JOIN TestTbl t2 ON t.id - 1 = t2.id
    WHERE t2.id IS NULL
),
E AS (
    SELECT ROW_NUMBER() OVER (order by t.id) AS RN, t.id
    FROM TestTbl t
    LEFT OUTER JOIN TestTbl t2 ON t.id + 1 = t2.id
    WHERE t2.id IS NULL
)
SELECT s.id AS [START], e.id AS [END]
FROM S
INNER JOIN E ON s.Rn = E.rn
GO
Using this, I tried to solve your problem:

USE tempdb
GO

IF OBJECT_ID('sales') IS NOT NULL
    DROP TABLE sales
CREATE TABLE sales(
    [name] [varchar](50) NOT NULL,
    [saledate] [datetime] NULL,
    [quantity] [int] NULL
)
GO
Insert into sales(name,saledate,quantity) values
('A','2012-01-01',1),('A','2012-02-01',2),('A','2012-03-01',3),
('A','2012-04-01',4),('A','2012-05-01',5),('A','2012-06-01',6),
('B','2012-01-01',6),('B','2012-02-01',2),('B','2012-03-01',3),
('B','2012-04-01',4),('B','2012-05-01',1),('B','2012-06-01',6),
('C','2012-01-01',6),('C','2012-02-01',1),('C','2012-03-01',3),
('C','2012-04-01',1),('C','2012-05-01',4),('C','2012-06-01',1),
('D','2012-01-01',6),('D','2012-02-01',3),('D','2012-03-01',3),
('D','2012-04-01',4),('D','2012-05-01',1),('D','2012-06-01',6);

-- Note: MONTH() ignores the year, so this only works for data within one calendar year.
WITH Fil AS (
    SELECT * FROM sales WHERE quantity > 2
),
-- S: months that start a run (no qualifying row in the previous month)
S AS (
    -- ordered by name and saledate so that start and end row numbers pair up deterministically
    SELECT ROW_NUMBER() OVER (ORDER BY t.name, t.saledate) AS RN,
           t.NAME,
           MONTH(t.saledate) AS id
    FROM Fil t
    LEFT OUTER JOIN Fil t2
        ON MONTH(t.saledate) - 1 = MONTH(t2.saledate) AND t2.name = t.name
    WHERE t2.saledate IS NULL
),
-- E: months that end a run (no qualifying row in the next month)
E AS (
    SELECT ROW_NUMBER() OVER (ORDER BY t.name, t.saledate) AS RN,
           t.NAME,
           MONTH(t.saledate) AS id
    FROM Fil t
    LEFT OUTER JOIN Fil t2
        ON MONTH(t.saledate) + 1 = MONTH(t2.saledate) AND t2.name = t.name
    WHERE t2.saledate IS NULL
),
Gap AS (
    SELECT e.NAME, s.id AS [DateStart], e.id AS [DateEnd]
    FROM S
    INNER JOIN E ON s.Rn = E.rn
    --AND s.id<>e.id
)
SELECT g.NAME,
       g.Datestart,
       sum(CASE WHEN (sales.quantity > 2)
                 AND MONTH(sales.saledate) BETWEEN g.datestart AND g.dateend
                THEN 1 ELSE 0 END) as Res
FROM sales
INNER JOIN gap g
    ON sales.name = g.NAME AND month(sales.saledate) >= g.datestart
GROUP BY g.name, g.Datestart
ORDER BY g.name, g.Datestart

I missed the criteria.