• mickyT (12/5/2012)


    So comparing the islands and recursive queries returning similar rows

    ;WITH cte AS (

    SELECT name,

    DATEADD(mm, - ROW_NUMBER() OVER (ORDER BY name, saledate), saledate) dategroup,

    saledate

    FROM #sales

    WHERE quantity > 2 and saledate >= '2012-01-01'

    )

    SELECT name, max(saledate), COUNT(*)

    FROM cte

    GROUP BY name, dategroup

    HAVING COUNT(*) > 1

    ORDER BY name, dategroup

    ;with m2_cte_f (name,saledate,quantity,ind) as (

    select s.*, 0 as ind

    from #sales s

    where s.saledate='2012-01-01'

    union all

    select s.*, case when s.quantity > 2 and sc.quantity > 2 then 1 else 0 end as ind

    from #sales s

    inner join m2_cte_f sc

    on (s.saledate = dateadd(month,1,sc.saledate) and s.name=sc.name)

    where sc.ind = 0

    )

    select * from m2_cte_f where ind=1

    I get the following IO stats (timing not worth mentioning 1ms each) for the small test set

    (3 row(s) affected)

    Table '#sales____00000000009F'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (3 row(s) affected)

    Table 'Worktable'. Scan count 2, logical reads 91, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#sales____00000000009F'. Scan count 2, logical reads 14, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Upping the stakes a tiny bit by putting a moderate amount of data (3000 odd rows) into the table

    INSERT INTO #sales (name, saledate)

    SELECT *

    FROM

    (SELECT * FROM (VALUES('A'),('B'),('C'),('D'),('E'),('F'),('G'),('H'),('I'),('J'),('K'),('L'),('M'),('N'),('O'),('P'),('Q'),('R'),('S'),('T')) as sales(name)) names,

    (SELECT TOP 156 dateadd(mm, N, '1999-12-01') saledate FROM Tally) as months

    UPDATE #sales

    SET quantity = RAND(Checksum(Newid())) * 5

    CREATE CLUSTERED INDEX SALES_IDX1 ON #sales (saledate, name)

    Thanks for all updates on this topic.

    I have one question for this test.

    It seems the test data starts at 1999-12-01, but queries use '2012-01-01' to start. I am not sure what the end date is.

    I was wondering if it makes any difference after we make some change on start date in these 2 queries, so that start date = the earlies test date, then sorting may go through all period for query 1. Of course, recursion will have more joins for query 2.