|
|
|
Valued Member
      
Group: General Forum Members
Last Login: Tuesday, May 07, 2013 12:19 PM
Points: 63,
Visits: 144
|
|
| Well after 1 hour and 11 minutes it reached the maximum recursion of 32767 and terminated. I'm going to try and figure out what exactly this query is doing because it works beautifully when there is only one account but when you add more than one account it breaks. I'm going to edit the sample data to include another account. Sorry about that.
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 7:51 PM
Points: 32,910,
Visits: 26,800
|
|
nathan 7372 (2/26/2010) Well after 1 hour and 11 minutes it reached the maximum recursion of 32767 and terminated. I'm going to try and figure out what exactly this query is doing because it works beautifully when there is only one account but when you add more than one account it breaks. I'm going to edit the sample data to include another account. Sorry about that.
I was afraid of that when I saw the recursive CTE. They actually tend to be a bit slower than even a cursor. How many rows in total are you processing?
Also, what do you mean by "Refund"? Are you talking about the Amount column?
If all goes well, the example I'm writing should give you what you want at a rate of about a million rows every 7 seconds or so. Nope... that's not a misprint.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
Valued Member
      
Group: General Forum Members
Last Login: Tuesday, May 07, 2013 12:19 PM
Points: 63,
Visits: 144
|
|
| Yeah I'm working with refunds so refund=amount. I have 222K rows of data which comes out to about 11K accounts. I will definatly be interested to see your query especially if it puts out the performance you claim. Thanks
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 7:51 PM
Points: 32,910,
Visits: 26,800
|
|
nathan 7372 (2/26/2010) Yeah I'm working with refunds so refund=amount. I have 222K rows of data which comes out to about 11K accounts. I will definatly be interested to see your query especially if it puts out the performance you claim. Thanks
Do you have a running balance column in your real table that you need to have filled in? If so, I need you to post the create statement including all indexes for the real refund table, please. It's important.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
Valued Member
      
Group: General Forum Members
Last Login: Tuesday, May 07, 2013 12:19 PM
Points: 63,
Visits: 144
|
|
| Currently there is no running balance column in the table. If I need to add one let me know. I also have no indexes at this point. I probably should add some but I'm not entirely sure what would be a useful index at this point.
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 7:51 PM
Points: 32,910,
Visits: 26,800
|
|
nathan 7372 (2/26/2010) Currently there is no running balance column in the table. If I need to add one let me know. I also have no indexes at this point. I probably should add some but I'm not entirely sure what would be a useful index at this point.
Thanks for the quick feedback. Not to worry... let's peel one potato at a time. First, we'll get you the answer you need and then we'll work on your table. I'm almost done.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 7:51 PM
Points: 32,910,
Visits: 26,800
|
|
Ok... here we go. This method is commonly referred to as the "Quirky Update" method and it's appropriately named because if you don't follow all the rules to a "T", it'll get "quirky" on you and it'll come up with the wrong answers. If you do follow the rules, it's rock solid accurate and nasty fast.
There are some rules that won't be apparent in the code below. To wit, I'd really like you to take the time (it's 22 pages long in MS Word) to read the article at the following URL. Don't let the warnings scare you if you follow the rules. Sorry to sound brutal about this but I value your data as much as I do my own so if you think you're smarter than the rules and decide not to follow the rules, then don't use the "Quirky Update" method. Instead, use a well written FORWARD ONLY, READ ONLY, STATIC cursor which (not including a CLR) is the third fastest method (the second fastest is a bit more complicated) for doing running totals of this nature.
http://www.sqlservercentral.com/articles/T-SQL/68467/
Now, if you do decide to follow the rules, prepare to be absolutely amazed. The following code includes the test data you currently have in your original post as well as the code to do your running balances for all accounts in a Temp table. If it seems like it didn't actually run, that's the amazing part... shear blinding speed. Your 222k rows should take about 2 seconds to run and most of that is setting up the Temp Table and the required clustered index. The run time doesn't include the display time.
I should also mention that I shifted from FLOAT to 17 place decimal numbers. You'll find I didn't come up with the same total for the one account as you did because I'm running just a bit more accurate because less rounding is occuring (I think). I also split out the monthly interest so you can see how much it actually is. The "Balance" column is what you were referring to as the "Interest(Month)".
Here's the code:
--===================================================================================================================== -- This is the original test data including the second account you added this evening --===================================================================================================================== IF OBJECT_ID('TempDB..#InterestRates','U') IS NOT NULL DROP TABLE #InterestRates ; CREATE TABLE #InterestRates ( [InterestRate] DECIMAL(28,17) NULL, [Month] [int] NULL, [year] [int] NULL ) ; INSERT INTO #InterestRates (InterestRate,[Month],[Year]) SELECT '0.105','8','2007' UNION ALL SELECT '0.105','9','2007' UNION ALL SELECT '0.105','10','2007' UNION ALL SELECT '0.105','11','2007' UNION ALL SELECT '0.105','12','2007' UNION ALL SELECT '0.105','1','2008' UNION ALL SELECT '0.105','2','2008' UNION ALL SELECT '0.105','3','2008' UNION ALL SELECT '0.105','4','2008' UNION ALL SELECT '0.105','5','2008' UNION ALL SELECT '0.105','6','2008' UNION ALL SELECT '0.105','7','2008' UNION ALL SELECT '0.105','8','2008' UNION ALL SELECT '0.105','9','2008' UNION ALL SELECT '0.105','10','2008' UNION ALL SELECT '0.105','11','2008' UNION ALL SELECT '0.105','12','2008' UNION ALL SELECT '0.105','1','2009' UNION ALL SELECT '0.105','2','2009' UNION ALL SELECT '0.105','3','2009' UNION ALL SELECT '0.105','4','2009' UNION ALL SELECT '0.105','5','2009' UNION ALL SELECT '0.105','6','2009' UNION ALL SELECT '0.0067','7','2009' UNION ALL SELECT '0.0067','8','2009' UNION ALL SELECT '0.0067','9','2009' UNION ALL SELECT '0.0055','10','2009' UNION ALL SELECT '0.0055','11','2009' UNION ALL SELECT '0.0055','12','2009' UNION ALL SELECT '0.0055','1','2010' UNION ALL SELECT '0.0055','2','2010' ; IF OBJECT_ID('TempDB..#Refunds','U') IS NOT NULL DROP TABLE #Refunds ; CREATE TABLE #Refunds ( [Acct_no] varchar(20) NULL, [Amount] DECIMAL(28,17) NULL, [Month] [int] NULL, [year] [int] NULL ) ; INSERT INTO #Refunds (Acct_no,Amount, [Month],[year]) SELECT '1271003600','333.107456586203','1','2008' UNION ALL SELECT '1271003600','87.6816131178288','1','2009' UNION ALL SELECT '1271003600','103.602002310821','2','2008' UNION ALL SELECT '1271003600','81.9722910242125','2','2009' UNION ALL SELECT '1271003600','72.3401647559977','3','2008' UNION ALL SELECT '1271003600','86.1625895742832','3','2009' UNION ALL SELECT '1271003600','103.361218989159','4','2008' UNION ALL SELECT '1271003600','84.6065155612006','4','2009' UNION ALL SELECT '1271003600','92.2834655375568','5','2008' UNION ALL SELECT '1271003600','84.6688365382596','5','2009' UNION ALL SELECT '1271003600','89.0687598341003','6','2008' UNION ALL SELECT '1271003600','79.1241550533365','6','2009' UNION ALL SELECT '1271003600','123.131877952709','7','2008' UNION ALL SELECT '1271003600','89.3938505146093','7','2009' UNION ALL SELECT '1271003600','87.2510378856992','8','2007' UNION ALL SELECT '1271003600','94.0687846832911','8','2008' UNION ALL SELECT '1271003600','340.381691069552','8','2009' UNION ALL SELECT '1271003600','80.3632310400272','9','2007' UNION ALL SELECT '1271003600','121.96840107118','9','2008' UNION ALL SELECT '1271003600','87.1037670414245','10','2007' UNION ALL SELECT '1271003600','94.9861134399298','10','2008' UNION ALL SELECT '1271003600','73.8008470324637','11','2007' UNION ALL SELECT '1271003600','78.537373182859','11','2008' UNION ALL SELECT '1271003600','90.7988754111144','12','2007' UNION ALL SELECT '1271003600','88.4954811818565','12','2008' UNION ALL SELECT '1271005700 ','651.764','1','2008' UNION ALL SELECT '1271005700 ','650.766','1','2009' UNION ALL SELECT '1271005700 ','897.994','2','2008' UNION ALL SELECT '1271005700 ','688.486','2','2009' UNION ALL SELECT '1271005700 ','721.466','3','2008' UNION ALL SELECT '1271005700 ','718.654','3','2009' UNION ALL SELECT '1271005700 ','913.072','4','2008' UNION ALL SELECT '1271005700 ','863.224','4','2009' UNION ALL SELECT '1271005700 ','735.081','5','2008' UNION ALL SELECT '1271005700 ','647.368','5','2009' UNION ALL SELECT '1271005700 ','607.721','6','2008' UNION ALL SELECT '1271005700 ','580.359','6','2009' UNION ALL SELECT '1271005700 ','718.969','7','2008' UNION ALL SELECT '1271005700 ','693.378','7','2009' UNION ALL SELECT '1271005700 ','734.005','8','2007' UNION ALL SELECT '1271005700 ','591.788','8','2008' UNION ALL SELECT '1271005700 ','679.773','8','2009' UNION ALL SELECT '1271005700 ','560.296','9','2007' UNION ALL SELECT '1271005700 ','685.047','9','2008' UNION ALL SELECT '1271005700 ','682.834','10','2007' UNION ALL SELECT '1271005700 ','608.706','10','2008' UNION ALL SELECT '1271005700 ','586.172','11','2007' UNION ALL SELECT '1271005700 ','650.191','11','2008' UNION ALL SELECT '1271005700 ','858.672','12','2007' UNION ALL SELECT '1271005700 ','711.708','12','2008' ; --===================================================================================================================== -- Solution to the problem starts here --=====================================================================================================================
--===== Conditionally drop the work table to make reruns for any troubleshooting easier. IF OBJECT_ID('TempDB..#Work','U') IS NOT NULL DROP TABLE #Work ; --===== Declare local variables to support the upcoming running total update DECLARE @PrevAcct VARCHAR(20), @PrevBal DECIMAL(28,17), @IntAmt DECIMAL(28,17) ; --===== This set of cascading CTE's preconditions the data for processing and later display WITH cteAccount AS ( --=== Find all distinct account numbers so we can apply to all dates for rates without dupes SELECT DISTINCT Acct_No FROM #Refunds ), cteRate AS ( --=== Apply the account number to all date rates without dupes SELECT acct.Acct_No, rate.[Year], rate.[Month], rate.InterestRate/12 AS MonthRate FROM #InterestRates rate CROSS JOIN cteAccount acct ) SELECT ISNULL(rate.Acct_No,'') AS Acct_No, ISNULL(DATEADD(mm,rate.[Month],DATEADD(yy,rate.[YEAR]-1900,0)),0) AS MonthDate, rate.MonthRate, ISNULL(amt.Amount,0) AS Amount, CAST(NULL AS DECIMAL(28,17)) AS InterestAmount, CAST(NULL AS DECIMAL(28,17)) AS Balance INTO #Work FROM cteRate rate LEFT JOIN #Refunds amt ON rate.Acct_No = amt.Acct_No AND rate.[Year] = amt.[Year] AND rate.[Month] = amt.[Month] ; --===== Add the required clustered index ALTER TABLE #Work ADD PRIMARY KEY CLUSTERED (Acct_No, MonthDate) WITH FILLFACTOR = 100 ; --===== Do the update to calculate the interest amounts and the running total balance. -- This will do a million rows in about 3 to 7 seconds UPDATE #Work SET @IntAmt = InterestAmount = CASE WHEN @PrevAcct = Acct_No THEN (@PrevBal + Amount) * Monthrate ELSE (Amount) * MonthRate END, @PrevBal = Balance = CASE WHEN @PrevAcct = Acct_No THEN @PrevBal + Amount + @IntAmt ELSE Amount + @IntAmt END, @PrevAcct = Acct_No FROM #Work WITH(TABLOCKX) --Not required on a temp table, but good habit for this type of update. OPTION (MAXDOP 1) --Absolutely essential whether you use a temp table or not. ;
... and here's a bit on how you might display things... we can round the amounts to 2 decimal places, if you'd like.
SELECT Acct_No, RIGHT(CONVERT(CHAR(11),MonthDate,113),8) AS [Month], Amount, InterestAmount, Balance FROM #Work ORDER BY Acct_No, MonthDate ;
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
SSChampion
        
Group: General Forum Members
Last Login: Today @ 5:10 PM
Points: 10,989,
Visits: 10,535
|
|
For completeness, I'm going to add a new method to the mix (before anyone else jumps in!).
This is a method called 'set-based iteration' by MVP Hugo Kornelis, which was first brought to my attention here in a discussion of one of Jeff's fine articles.
While not quite as fast as the Quirky Update, it is in the same ballpark. It is faster than a recursive CTE on large sets, and even manages to out-perform the best CLR solution I have been able to put together, at least for some data sets.
The key to performance with the set-based iteration method is to have at least several groups in the data. It operates on a divide-and-conquer basis, and the unit of division is the group. In the present case, these groups are accounts - in general, they are the things that reset the running total. Worst case, if there is only one group, this method performs worse than an optimized cursor - so be aware of that.
In the sample data provided, there are two accounts, which is enough to make the method viable, but certainly not optimal. My guess is that the real data set contains many accounts, so that should be all good.
I have shamelessly re-used Jeff's fine test script, up to the point marked 'Solution to the problem starts here'. The following code runs from that point, and again draws heavily on Jeff's work. I hope he won't mind.
-- A work table to hold the reformatted data, and -- ultimately, the results CREATE TABLE #Work ( Acct_No VARCHAR(20) NOT NULL, MonthDate DATETIME NOT NULL, MonthRate DECIMAL(19,12) NOT NULL, Amount DECIMAL(19,12) NOT NULL, InterestAmount DECIMAL(19,12) NOT NULL, RunningTotal DECIMAL(19,12) NOT NULL, RowRank BIGINT NOT NULL );
-- Slightly modified from Jeff's example, to prepare -- for the set-based iteration method WITH Accounts AS ( -- Get a list of the account numbers SELECT DISTINCT Acct_No FROM #Refunds ), Rates AS ( -- Apply all the accounts to all the rates SELECT A.Acct_No, R.[Year], R.[Month], MonthRate = R.InterestRate / 12 FROM #InterestRates R CROSS JOIN Accounts A ), BaseData AS ( -- The basic data we need to work with SELECT Acct_No = ISNULL(R.Acct_No,''), MonthDate = ISNULL(DATEADD(MONTH, R.[Month], DATEADD(YEAR, R.[year] - 1900, 0)), 0), R.MonthRate, Amount = ISNULL(RF.Amount,0), InterestAmount = ISNULL(RF.Amount,0) * R.MonthRate, RunningTotal = ISNULL(RF.Amount,0) FROM Rates R LEFT JOIN #Refunds RF ON RF.Acct_No = R.Acct_No AND RF.[Year] = R.[Year] AND RF.[Month] = R.[Month] ) -- Basic data plus a rank id, numbering the rows by MonthDate, and resetting to 1 for each new Account INSERT #Work (Acct_No, MonthDate, MonthRate, Amount, InterestAmount, RunningTotal, RowRank) SELECT BD.Acct_No, BD.MonthDate, BD.MonthRate, BD.Amount, BD.InterestAmount, BD.RunningTotal, RowRank = RANK() OVER (PARTITION BY BD.Acct_No ORDER BY MonthDate) FROM BaseData BD;
-- An index to speed the next stage (different from that used with the Quirky Update method) CREATE UNIQUE CLUSTERED INDEX nc1 ON #Work (RowRank, Acct_No);
-- Iteration variables DECLARE @Rank BIGINT, @RowCount INTEGER;
-- Initialize SELECT @Rank = 1, @RowCount = 1;
-- This is the iteration bit, processes a rank id per iteration -- The number of rows processed with each iteration is equal to the number of groups in the data -- More groups --> greater efficiency WHILE (1 = 1) BEGIN SET @Rank = @Rank + 1; -- Set-based update with running totals for the current rank id UPDATE This SET InterestAmount = (Previous.RunningTotal + This.Amount) * This.MonthRate, RunningTotal = Previous.RunningTotal + This.Amount + (Previous.RunningTotal + This.Amount) * This.MonthRate FROM #Work This JOIN #Work Previous ON Previous.Acct_No = This.Acct_No AND Previous.RowRank = @Rank - 1 WHERE This.RowRank = @Rank; IF (@@ROWCOUNT = 0) BREAK; END;
-- Show the results in natural order SELECT * FROM #Work ORDER BY Acct_No, RowRank;
I continue to recommend the Quirky Update over this method, for sheer speed, but this method does have the advantage of being fully documented and supported. I present it as 'Option 2'.
Paul
edit: to reflect the improvements in the recursive CTE method
Paul White SQL Server MVP SQLblog.com @SQL_Kiwi
|
|
|
|
|
SSC Eights!
      
Group: General Forum Members
Last Login: Today @ 10:09 AM
Points: 921,
Visits: 3,744
|
|
Paul White (2/27/2010) It is certainly orders of magnitude faster than a recursive CTE on large sets
Hi Paul, an excellent solution as always, and a method to remember.
There are a couple of things I've noticed while working with CTE's over the last few weeks of using them in anger. The first is that, even with relatively small data sets, chaining CTE's can be a really bad idea. The optimiser appears to screw things up, row counts go through the roof and runtimes rapidly escalate. I've got an excellent example of this where breaking the chain by spooling into a temp table improved performance about 20-fold on a modest pagination routine. Funny thing is, the very first run took less than a second, before the optimiser lost it. The other thing I've noticed is that recursive CTE's can be blisteringly fast provided that the required conditions are met.
So, reusing Jeff's sample data (thanks Jeff), clumsily ramping up to about about 3,000 accounts across about 100,000 rows, and processing a data set which already contains a contiguous row number in the correct order for execution, here goes...
First, the sample data:
--===================================================================================================================== -- This is the original test data including the second account you added this evening --===================================================================================================================== SET NOCOUNT ON
IF OBJECT_ID('TempDB..#InterestRates','U') IS NOT NULL DROP TABLE #InterestRates ; CREATE TABLE #InterestRates ( [InterestRate] DECIMAL(28,17) NULL, [Month] [int] NULL, [year] [int] NULL ) ; INSERT INTO #InterestRates (InterestRate,[Month],[Year]) SELECT '0.105','8','2007' UNION ALL SELECT '0.105','9','2007' UNION ALL SELECT '0.105','10','2007' UNION ALL SELECT '0.105','11','2007' UNION ALL SELECT '0.105','12','2007' UNION ALL SELECT '0.105','1','2008' UNION ALL SELECT '0.105','2','2008' UNION ALL SELECT '0.105','3','2008' UNION ALL SELECT '0.105','4','2008' UNION ALL SELECT '0.105','5','2008' UNION ALL SELECT '0.105','6','2008' UNION ALL SELECT '0.105','7','2008' UNION ALL SELECT '0.105','8','2008' UNION ALL SELECT '0.105','9','2008' UNION ALL SELECT '0.105','10','2008' UNION ALL SELECT '0.105','11','2008' UNION ALL SELECT '0.105','12','2008' UNION ALL SELECT '0.105','1','2009' UNION ALL SELECT '0.105','2','2009' UNION ALL SELECT '0.105','3','2009' UNION ALL SELECT '0.105','4','2009' UNION ALL SELECT '0.105','5','2009' UNION ALL SELECT '0.105','6','2009' UNION ALL SELECT '0.0067','7','2009' UNION ALL SELECT '0.0067','8','2009' UNION ALL SELECT '0.0067','9','2009' UNION ALL SELECT '0.0055','10','2009' UNION ALL SELECT '0.0055','11','2009' UNION ALL SELECT '0.0055','12','2009' UNION ALL SELECT '0.0055','1','2010' UNION ALL SELECT '0.0055','2','2010' ; IF OBJECT_ID('TempDB..#Refunds','U') IS NOT NULL DROP TABLE #Refunds ; CREATE TABLE #Refunds ( [Acct_no] varchar(20) NULL, [Amount] DECIMAL(28,17) NULL, [Month] [int] NULL, [year] [int] NULL ) ; INSERT INTO #Refunds (Acct_no,Amount, [Month],[year]) SELECT '1271003600','333.107456586203','1','2008' UNION ALL SELECT '1271003600','87.6816131178288','1','2009' UNION ALL SELECT '1271003600','103.602002310821','2','2008' UNION ALL SELECT '1271003600','81.9722910242125','2','2009' UNION ALL SELECT '1271003600','72.3401647559977','3','2008' UNION ALL SELECT '1271003600','86.1625895742832','3','2009' UNION ALL SELECT '1271003600','103.361218989159','4','2008' UNION ALL SELECT '1271003600','84.6065155612006','4','2009' UNION ALL SELECT '1271003600','92.2834655375568','5','2008' UNION ALL SELECT '1271003600','84.6688365382596','5','2009' UNION ALL SELECT '1271003600','89.0687598341003','6','2008' UNION ALL SELECT '1271003600','79.1241550533365','6','2009' UNION ALL SELECT '1271003600','123.131877952709','7','2008' UNION ALL SELECT '1271003600','89.3938505146093','7','2009' UNION ALL SELECT '1271003600','87.2510378856992','8','2007' UNION ALL SELECT '1271003600','94.0687846832911','8','2008' UNION ALL SELECT '1271003600','340.381691069552','8','2009' UNION ALL SELECT '1271003600','80.3632310400272','9','2007' UNION ALL SELECT '1271003600','121.96840107118','9','2008' UNION ALL SELECT '1271003600','87.1037670414245','10','2007' UNION ALL SELECT '1271003600','94.9861134399298','10','2008' UNION ALL SELECT '1271003600','73.8008470324637','11','2007' UNION ALL SELECT '1271003600','78.537373182859','11','2008' UNION ALL SELECT '1271003600','90.7988754111144','12','2007' UNION ALL SELECT '1271003600','88.4954811818565','12','2008' UNION ALL SELECT '1271005700 ','651.764','1','2008' UNION ALL SELECT '1271005700 ','650.766','1','2009' UNION ALL SELECT '1271005700 ','897.994','2','2008' UNION ALL SELECT '1271005700 ','688.486','2','2009' UNION ALL SELECT '1271005700 ','721.466','3','2008' UNION ALL SELECT '1271005700 ','718.654','3','2009' UNION ALL SELECT '1271005700 ','913.072','4','2008' UNION ALL SELECT '1271005700 ','863.224','4','2009' UNION ALL SELECT '1271005700 ','735.081','5','2008' UNION ALL SELECT '1271005700 ','647.368','5','2009' UNION ALL SELECT '1271005700 ','607.721','6','2008' UNION ALL SELECT '1271005700 ','580.359','6','2009' UNION ALL SELECT '1271005700 ','718.969','7','2008' UNION ALL SELECT '1271005700 ','693.378','7','2009' UNION ALL SELECT '1271005700 ','734.005','8','2007' UNION ALL SELECT '1271005700 ','591.788','8','2008' UNION ALL SELECT '1271005700 ','679.773','8','2009' UNION ALL SELECT '1271005700 ','560.296','9','2007' UNION ALL SELECT '1271005700 ','685.047','9','2008' UNION ALL SELECT '1271005700 ','682.834','10','2007' UNION ALL SELECT '1271005700 ','608.706','10','2008' UNION ALL SELECT '1271005700 ','586.172','11','2007' UNION ALL SELECT '1271005700 ','650.191','11','2008' UNION ALL SELECT '1271005700 ','858.672','12','2007' UNION ALL SELECT '1271005700 ','711.708','12','2008' ; --===================================================================================================================== -- Solution to the problem starts here --=====================================================================================================================
--===== Conditionally drop the work table to make reruns for any troubleshooting easier. IF OBJECT_ID('TempDB..#Work','U') IS NOT NULL DROP TABLE #Work; --===== This set of cascading CTE's preconditions the data for processing and later display WITH cteAccount AS ( --=== Find all distinct account numbers so we can apply to all dates for rates without dupes SELECT DISTINCT Acct_No FROM #Refunds ), cteRate AS ( --=== Apply the account number to all date rates without dupes SELECT acct.Acct_No, rate.[Year], rate.[Month], rate.InterestRate/12 AS MonthRate FROM #InterestRates rate CROSS JOIN cteAccount acct ) SELECT ExecSeq = ISNULL(CAST(0 AS INT), 0), ISNULL(rate.Acct_No,'') AS Acct_No, ISNULL(DATEADD(mm,rate.[Month],DATEADD(yy,rate.[YEAR]-1900,0)),0) AS MonthDate, rate.MonthRate, ISNULL(amt.Amount,0) AS Amount, CAST(NULL AS DECIMAL(28,17)) AS InterestAmount, CAST(NULL AS DECIMAL(28,17)) AS Balance INTO #Work FROM cteRate rate LEFT JOIN #Refunds amt ON rate.Acct_No = amt.Acct_No AND rate.[Year] = amt.[Year] AND rate.[Month] = amt.[Month] ;
-- scale up the table yuck but it's a one-off DECLARE @Acct_no INT, @Counter INT SET @Counter = 1
WHILE @Counter < 3220 -- 322 corresponds to 10,000-ish rows BEGIN SELECT @Acct_no = MAX(Acct_no) FROM #Work
INSERT INTO #Work (ExecSeq, Acct_no, MonthDate, MonthRate, Amount) SELECT ExecSeq, Acct_no = CAST(@Acct_no+1 AS VARCHAR(20)), MonthDate, MonthRate, Amount FROM #Work WHERE Acct_no = CAST(@Acct_no AS VARCHAR(20)) -- 25 rows SET @Counter = @Counter + 1 END
-- put in the all important execution order UPDATE w SET ExecSeq = t.ExecSeq FROM #Work w INNER JOIN (SELECT ExecSeq = ROW_NUMBER() OVER(ORDER BY Acct_no, MonthDate), Acct_no, MonthDate FROM #Work) t ON t.Acct_no = w.Acct_no AND t.MonthDate = w.MonthDate
--===== Add the very useful clustered index CREATE CLUSTERED INDEX [ExecSeq] ON #Work ([ExecSeq] ASC) Here's the slightly modified solution:
;WITH Calculator AS ( SELECT ExecSeq, Acct_no, Amount, MonthDate, MonthRate, InterestAmount = ISNULL(InterestAmount,0) FROM #Work w WHERE ExecSeq = 1 UNION ALL SELECT cr.ExecSeq, ISNULL(cr.Acct_no, lr.Acct_no), ISNULL(cr.Amount, 0), cr.MonthDate, cr.MonthRate, InterestAmount = CAST(CASE WHEN cr.Acct_no = lr.Acct_no THEN (lr.Amount + lr.InterestAmount) * (1 + (cr.MonthRate)) ELSE 0 END AS DECIMAL(28,17)) FROM Calculator lr INNER JOIN #Work cr ON cr.ExecSeq = lr.ExecSeq+1 ) SELECT * FROM Calculator ORDER BY ExecSeq OPTION (MAXRECURSION 0)
This runs in 5 seconds, or 3 if the ORDER BY is removed (???). If a single row is selected, say the second last (ExecSeq = 99850), it takes about 3 seconds and the result is correct.
100,000 rows, 5 seconds - that's not a misprint either Uncle Jeff  A running total over a million rows of a simple but properly structured and indexed table can take as little as three or four seconds using this method.
Option 1, version 2.
Nathan, regarding recursive CTE's; "the reference to self in the recursive part represents the results of the previous iteration". It might be nowhere near what SQL Server is doing under the bonnet but it's all you need to know to make recursive CTE's work.
Cheers
ChrisM
Low-hanging fruit picker and defender of the moggies
For better assistance in answering your questions, please read this.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
|
|
|
|
|
SSChampion
        
Group: General Forum Members
Last Login: Today @ 5:10 PM
Points: 10,989,
Visits: 10,535
|
|
Hi Chris,
I had to add a quick index on acct_no to get the table scale-up thing to work in my lifetime (the MAX gets out of hand without it!), but other than that:
This looks to be an interesting way to speed up this sort of recursive query, and it certainly seems to work. There are a number of similarities to the set-based iteration method, like the pre-computing of an order using a ranking function. I want to look at this a bit more deeply before saying much more, but the index spool at the 'lowest level' of the plan seems to be one of the keys to success here - it saves scanning the input table a good many times. I would expect an explicit index on the same keys as the spool to be beneficial, I'll come back to this.
You're right that recursive CTEs can be fast, but it is relative. I sometimes use them to seek down the distinct keys of a large index rather than scanning the whole thing and relying on an aggregate, for example. Where a good set-based alternative exists I usually go with that. Maybe I'll revisit that once I've had a deeper think on this.
I must admit I spent a few minutes expanding the data set to work with the set-based iteration method to see how that went. The recursive CTE ran for just over seven seconds on my old laptop before starting to return results. The set-based iteration method ran for two 
Paul
Paul White SQL Server MVP SQLblog.com @SQL_Kiwi
|
|
|
|