• Paul White (2/27/2010)


    It is certainly orders of magnitude faster than a recursive CTE on large sets

    Hi Paul, an excellent solution as always, and a method to remember.

    There are a couple of things I've noticed while working with CTE's over the last few weeks of using them in anger. The first is that, even with relatively small data sets, chaining CTE's can be a really bad idea. The optimiser appears to screw things up, row counts go through the roof and runtimes rapidly escalate. I've got an excellent example of this where breaking the chain by spooling into a temp table improved performance about 20-fold on a modest pagination routine. Funny thing is, the very first run took less than a second, before the optimiser lost it.

    The other thing I've noticed is that recursive CTE's can be blisteringly fast provided that the required conditions are met.

    So, reusing Jeff's sample data (thanks Jeff), clumsily ramping up to about about 3,000 accounts across about 100,000 rows, and processing a data set which already contains a contiguous row number in the correct order for execution, here goes...

    First, the sample data:

    --=====================================================================================================================

    -- This is the original test data including the second account you added this evening

    --=====================================================================================================================

    SET NOCOUNT ON

    IF OBJECT_ID('TempDB..#InterestRates','U') IS NOT NULL

    DROP TABLE #InterestRates

    ;

    CREATE TABLE #InterestRates

    (

    [InterestRate] DECIMAL(28,17) NULL,

    [Month] [int] NULL,

    [year] [int] NULL

    )

    ;

    INSERT INTO #InterestRates

    (InterestRate,[Month],[Year])

    SELECT '0.105','8','2007' UNION ALL

    SELECT '0.105','9','2007' UNION ALL

    SELECT '0.105','10','2007' UNION ALL

    SELECT '0.105','11','2007' UNION ALL

    SELECT '0.105','12','2007' UNION ALL

    SELECT '0.105','1','2008' UNION ALL

    SELECT '0.105','2','2008' UNION ALL

    SELECT '0.105','3','2008' UNION ALL

    SELECT '0.105','4','2008' UNION ALL

    SELECT '0.105','5','2008' UNION ALL

    SELECT '0.105','6','2008' UNION ALL

    SELECT '0.105','7','2008' UNION ALL

    SELECT '0.105','8','2008' UNION ALL

    SELECT '0.105','9','2008' UNION ALL

    SELECT '0.105','10','2008' UNION ALL

    SELECT '0.105','11','2008' UNION ALL

    SELECT '0.105','12','2008' UNION ALL

    SELECT '0.105','1','2009' UNION ALL

    SELECT '0.105','2','2009' UNION ALL

    SELECT '0.105','3','2009' UNION ALL

    SELECT '0.105','4','2009' UNION ALL

    SELECT '0.105','5','2009' UNION ALL

    SELECT '0.105','6','2009' UNION ALL

    SELECT '0.0067','7','2009' UNION ALL

    SELECT '0.0067','8','2009' UNION ALL

    SELECT '0.0067','9','2009' UNION ALL

    SELECT '0.0055','10','2009' UNION ALL

    SELECT '0.0055','11','2009' UNION ALL

    SELECT '0.0055','12','2009' UNION ALL

    SELECT '0.0055','1','2010' UNION ALL

    SELECT '0.0055','2','2010'

    ;

    IF OBJECT_ID('TempDB..#Refunds','U') IS NOT NULL

    DROP TABLE #Refunds

    ;

    CREATE TABLE #Refunds

    (

    [Acct_no] varchar(20) NULL,

    [Amount] DECIMAL(28,17) NULL,

    [Month] [int] NULL,

    [year] [int] NULL

    )

    ;

    INSERT INTO #Refunds

    (Acct_no,Amount, [Month],[year])

    SELECT '1271003600','333.107456586203','1','2008' UNION ALL

    SELECT '1271003600','87.6816131178288','1','2009' UNION ALL

    SELECT '1271003600','103.602002310821','2','2008' UNION ALL

    SELECT '1271003600','81.9722910242125','2','2009' UNION ALL

    SELECT '1271003600','72.3401647559977','3','2008' UNION ALL

    SELECT '1271003600','86.1625895742832','3','2009' UNION ALL

    SELECT '1271003600','103.361218989159','4','2008' UNION ALL

    SELECT '1271003600','84.6065155612006','4','2009' UNION ALL

    SELECT '1271003600','92.2834655375568','5','2008' UNION ALL

    SELECT '1271003600','84.6688365382596','5','2009' UNION ALL

    SELECT '1271003600','89.0687598341003','6','2008' UNION ALL

    SELECT '1271003600','79.1241550533365','6','2009' UNION ALL

    SELECT '1271003600','123.131877952709','7','2008' UNION ALL

    SELECT '1271003600','89.3938505146093','7','2009' UNION ALL

    SELECT '1271003600','87.2510378856992','8','2007' UNION ALL

    SELECT '1271003600','94.0687846832911','8','2008' UNION ALL

    SELECT '1271003600','340.381691069552','8','2009' UNION ALL

    SELECT '1271003600','80.3632310400272','9','2007' UNION ALL

    SELECT '1271003600','121.96840107118','9','2008' UNION ALL

    SELECT '1271003600','87.1037670414245','10','2007' UNION ALL

    SELECT '1271003600','94.9861134399298','10','2008' UNION ALL

    SELECT '1271003600','73.8008470324637','11','2007' UNION ALL

    SELECT '1271003600','78.537373182859','11','2008' UNION ALL

    SELECT '1271003600','90.7988754111144','12','2007' UNION ALL

    SELECT '1271003600','88.4954811818565','12','2008' UNION ALL

    SELECT '1271005700 ','651.764','1','2008' UNION ALL

    SELECT '1271005700 ','650.766','1','2009' UNION ALL

    SELECT '1271005700 ','897.994','2','2008' UNION ALL

    SELECT '1271005700 ','688.486','2','2009' UNION ALL

    SELECT '1271005700 ','721.466','3','2008' UNION ALL

    SELECT '1271005700 ','718.654','3','2009' UNION ALL

    SELECT '1271005700 ','913.072','4','2008' UNION ALL

    SELECT '1271005700 ','863.224','4','2009' UNION ALL

    SELECT '1271005700 ','735.081','5','2008' UNION ALL

    SELECT '1271005700 ','647.368','5','2009' UNION ALL

    SELECT '1271005700 ','607.721','6','2008' UNION ALL

    SELECT '1271005700 ','580.359','6','2009' UNION ALL

    SELECT '1271005700 ','718.969','7','2008' UNION ALL

    SELECT '1271005700 ','693.378','7','2009' UNION ALL

    SELECT '1271005700 ','734.005','8','2007' UNION ALL

    SELECT '1271005700 ','591.788','8','2008' UNION ALL

    SELECT '1271005700 ','679.773','8','2009' UNION ALL

    SELECT '1271005700 ','560.296','9','2007' UNION ALL

    SELECT '1271005700 ','685.047','9','2008' UNION ALL

    SELECT '1271005700 ','682.834','10','2007' UNION ALL

    SELECT '1271005700 ','608.706','10','2008' UNION ALL

    SELECT '1271005700 ','586.172','11','2007' UNION ALL

    SELECT '1271005700 ','650.191','11','2008' UNION ALL

    SELECT '1271005700 ','858.672','12','2007' UNION ALL

    SELECT '1271005700 ','711.708','12','2008'

    ;

    --=====================================================================================================================

    -- Solution to the problem starts here

    --=====================================================================================================================

    --===== Conditionally drop the work table to make reruns for any troubleshooting easier.

    IF OBJECT_ID('TempDB..#Work','U') IS NOT NULL

    DROP TABLE #Work;

    --===== This set of cascading CTE's preconditions the data for processing and later display

    WITH

    cteAccount AS

    ( --=== Find all distinct account numbers so we can apply to all dates for rates without dupes

    SELECT DISTINCT Acct_No

    FROM #Refunds

    ),

    cteRate AS

    ( --=== Apply the account number to all date rates without dupes

    SELECT acct.Acct_No,

    rate.[Year],

    rate.[Month],

    rate.InterestRate/12 AS MonthRate

    FROM #InterestRates rate

    CROSS JOIN cteAccount acct

    )

    SELECT ExecSeq = ISNULL(CAST(0 AS INT), 0),

    ISNULL(rate.Acct_No,'') AS Acct_No,

    ISNULL(DATEADD(mm,rate.[Month],DATEADD(yy,rate.[YEAR]-1900,0)),0) AS MonthDate,

    rate.MonthRate,

    ISNULL(amt.Amount,0) AS Amount,

    CAST(NULL AS DECIMAL(28,17)) AS InterestAmount,

    CAST(NULL AS DECIMAL(28,17)) AS Balance

    INTO #Work

    FROM cteRate rate

    LEFT JOIN #Refunds amt

    ON rate.Acct_No = amt.Acct_No

    AND rate.[Year] = amt.[Year]

    AND rate.[Month] = amt.[Month]

    ;

    -- scale up the table yuck but it's a one-off

    DECLARE @Acct_no INT, @Counter INT

    SET @Counter = 1

    WHILE @Counter < 3220 -- 322 corresponds to 10,000-ish rows

    BEGIN

    SELECT @Acct_no = MAX(Acct_no) FROM #Work

    INSERT INTO #Work (ExecSeq, Acct_no, MonthDate, MonthRate, Amount)

    SELECT ExecSeq, Acct_no = CAST(@Acct_no+1 AS VARCHAR(20)), MonthDate, MonthRate, Amount

    FROM #Work WHERE Acct_no = CAST(@Acct_no AS VARCHAR(20)) -- 25 rows

    SET @Counter = @Counter + 1

    END

    -- put in the all important execution order

    UPDATE w SET ExecSeq = t.ExecSeq

    FROM #Work w

    INNER JOIN (SELECT ExecSeq = ROW_NUMBER() OVER(ORDER BY Acct_no, MonthDate), Acct_no, MonthDate FROM #Work) t

    ON t.Acct_no = w.Acct_no AND t.MonthDate = w.MonthDate

    --===== Add the very useful clustered index

    CREATE CLUSTERED INDEX [ExecSeq] ON #Work ([ExecSeq] ASC)

    Here's the slightly modified solution:

    ;WITH Calculator AS (

    SELECT ExecSeq,

    Acct_no,

    Amount,

    MonthDate,

    MonthRate,

    InterestAmount = ISNULL(InterestAmount,0)

    FROM #Work w

    WHERE ExecSeq = 1

    UNION ALL

    SELECT cr.ExecSeq,

    ISNULL(cr.Acct_no, lr.Acct_no),

    ISNULL(cr.Amount, 0),

    cr.MonthDate,

    cr.MonthRate,

    InterestAmount = CAST(CASE WHEN cr.Acct_no = lr.Acct_no THEN (lr.Amount + lr.InterestAmount) * (1 + (cr.MonthRate)) ELSE 0 END AS DECIMAL(28,17))

    FROM Calculator lr

    INNER JOIN #Work cr ON cr.ExecSeq = lr.ExecSeq+1

    )

    SELECT * FROM Calculator ORDER BY ExecSeq OPTION (MAXRECURSION 0)

    This runs in 5 seconds, or 3 if the ORDER BY is removed (???). If a single row is selected, say the second last (ExecSeq = 99850), it takes about 3 seconds and the result is correct.

    100,000 rows, 5 seconds - that's not a misprint either Uncle Jeff 😛

    A running total over a million rows of a simple but properly structured and indexed table can take as little as three or four seconds using this method.

    Option 1, version 2.

    Nathan, regarding recursive CTE's; "the reference to self in the recursive part represents the results of the previous iteration". It might be nowhere near what SQL Server is doing under the bonnet but it's all you need to know to make recursive CTE's work.

    Cheers

    ChrisM


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]