Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12345»»»

Calculating interest query Expand / Collapse
Author
Message
Posted Friday, February 26, 2010 6:17 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Wednesday, April 2, 2014 11:57 AM
Points: 71, Visits: 172
Well after 1 hour and 11 minutes it reached the maximum recursion of 32767 and terminated. I'm going to try and figure out what exactly this query is doing because it works beautifully when there is only one account but when you add more than one account it breaks. I'm going to edit the sample data to include another account. Sorry about that.
Post #873773
Posted Friday, February 26, 2010 6:26 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 12:03 AM
Points: 37,065, Visits: 31,628
nathan 7372 (2/26/2010)
Well after 1 hour and 11 minutes it reached the maximum recursion of 32767 and terminated. I'm going to try and figure out what exactly this query is doing because it works beautifully when there is only one account but when you add more than one account it breaks. I'm going to edit the sample data to include another account. Sorry about that.


I was afraid of that when I saw the recursive CTE. They actually tend to be a bit slower than even a cursor. How many rows in total are you processing?

Also, what do you mean by "Refund"? Are you talking about the Amount column?

If all goes well, the example I'm writing should give you what you want at a rate of about a million rows every 7 seconds or so. Nope... that's not a misprint.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #873774
Posted Friday, February 26, 2010 6:32 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Wednesday, April 2, 2014 11:57 AM
Points: 71, Visits: 172
Yeah I'm working with refunds so refund=amount. I have 222K rows of data which comes out to about 11K accounts. I will definatly be interested to see your query especially if it puts out the performance you claim. Thanks
Post #873775
Posted Friday, February 26, 2010 6:48 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 12:03 AM
Points: 37,065, Visits: 31,628
nathan 7372 (2/26/2010)
Yeah I'm working with refunds so refund=amount. I have 222K rows of data which comes out to about 11K accounts. I will definatly be interested to see your query especially if it puts out the performance you claim. Thanks


Do you have a running balance column in your real table that you need to have filled in? If so, I need you to post the create statement including all indexes for the real refund table, please. It's important.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #873777
Posted Friday, February 26, 2010 6:51 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Wednesday, April 2, 2014 11:57 AM
Points: 71, Visits: 172
Currently there is no running balance column in the table. If I need to add one let me know. I also have no indexes at this point. I probably should add some but I'm not entirely sure what would be a useful index at this point.
Post #873778
Posted Friday, February 26, 2010 6:54 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 12:03 AM
Points: 37,065, Visits: 31,628
nathan 7372 (2/26/2010)
Currently there is no running balance column in the table. If I need to add one let me know. I also have no indexes at this point. I probably should add some but I'm not entirely sure what would be a useful index at this point.


Thanks for the quick feedback. Not to worry... let's peel one potato at a time. First, we'll get you the answer you need and then we'll work on your table. I'm almost done.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #873779
Posted Friday, February 26, 2010 7:22 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 12:03 AM
Points: 37,065, Visits: 31,628
Ok... here we go. This method is commonly referred to as the "Quirky Update" method and it's appropriately named because if you don't follow all the rules to a "T", it'll get "quirky" on you and it'll come up with the wrong answers. If you do follow the rules, it's rock solid accurate and nasty fast.

There are some rules that won't be apparent in the code below. To wit, I'd really like you to take the time (it's 22 pages long in MS Word) to read the article at the following URL. Don't let the warnings scare you if you follow the rules. Sorry to sound brutal about this but I value your data as much as I do my own so if you think you're smarter than the rules and decide not to follow the rules, then don't use the "Quirky Update" method. Instead, use a well written FORWARD ONLY, READ ONLY, STATIC cursor which (not including a CLR) is the third fastest method (the second fastest is a bit more complicated) for doing running totals of this nature.

http://www.sqlservercentral.com/articles/T-SQL/68467/

Now, if you do decide to follow the rules, prepare to be absolutely amazed. The following code includes the test data you currently have in your original post as well as the code to do your running balances for all accounts in a Temp table. If it seems like it didn't actually run, that's the amazing part... shear blinding speed. Your 222k rows should take about 2 seconds to run and most of that is setting up the Temp Table and the required clustered index. The run time doesn't include the display time.

I should also mention that I shifted from FLOAT to 17 place decimal numbers. You'll find I didn't come up with the same total for the one account as you did because I'm running just a bit more accurate because less rounding is occuring (I think). I also split out the monthly interest so you can see how much it actually is. The "Balance" column is what you were referring to as the "Interest(Month)".

Here's the code:
--=====================================================================================================================
-- This is the original test data including the second account you added this evening
--=====================================================================================================================
IF OBJECT_ID('TempDB..#InterestRates','U') IS NOT NULL
DROP TABLE #InterestRates
;
CREATE TABLE #InterestRates
(
[InterestRate] DECIMAL(28,17) NULL,
[Month] [int] NULL,
[year] [int] NULL
)
;
INSERT INTO #InterestRates
(InterestRate,[Month],[Year])
SELECT '0.105','8','2007' UNION ALL
SELECT '0.105','9','2007' UNION ALL
SELECT '0.105','10','2007' UNION ALL
SELECT '0.105','11','2007' UNION ALL
SELECT '0.105','12','2007' UNION ALL
SELECT '0.105','1','2008' UNION ALL
SELECT '0.105','2','2008' UNION ALL
SELECT '0.105','3','2008' UNION ALL
SELECT '0.105','4','2008' UNION ALL
SELECT '0.105','5','2008' UNION ALL
SELECT '0.105','6','2008' UNION ALL
SELECT '0.105','7','2008' UNION ALL
SELECT '0.105','8','2008' UNION ALL
SELECT '0.105','9','2008' UNION ALL
SELECT '0.105','10','2008' UNION ALL
SELECT '0.105','11','2008' UNION ALL
SELECT '0.105','12','2008' UNION ALL
SELECT '0.105','1','2009' UNION ALL
SELECT '0.105','2','2009' UNION ALL
SELECT '0.105','3','2009' UNION ALL
SELECT '0.105','4','2009' UNION ALL
SELECT '0.105','5','2009' UNION ALL
SELECT '0.105','6','2009' UNION ALL
SELECT '0.0067','7','2009' UNION ALL
SELECT '0.0067','8','2009' UNION ALL
SELECT '0.0067','9','2009' UNION ALL
SELECT '0.0055','10','2009' UNION ALL
SELECT '0.0055','11','2009' UNION ALL
SELECT '0.0055','12','2009' UNION ALL
SELECT '0.0055','1','2010' UNION ALL
SELECT '0.0055','2','2010'
;
IF OBJECT_ID('TempDB..#Refunds','U') IS NOT NULL
DROP TABLE #Refunds
;
CREATE TABLE #Refunds
(
[Acct_no] varchar(20) NULL,
[Amount] DECIMAL(28,17) NULL,
[Month] [int] NULL,
[year] [int] NULL
)
;
INSERT INTO #Refunds
(Acct_no,Amount, [Month],[year])
SELECT '1271003600','333.107456586203','1','2008' UNION ALL
SELECT '1271003600','87.6816131178288','1','2009' UNION ALL
SELECT '1271003600','103.602002310821','2','2008' UNION ALL
SELECT '1271003600','81.9722910242125','2','2009' UNION ALL
SELECT '1271003600','72.3401647559977','3','2008' UNION ALL
SELECT '1271003600','86.1625895742832','3','2009' UNION ALL
SELECT '1271003600','103.361218989159','4','2008' UNION ALL
SELECT '1271003600','84.6065155612006','4','2009' UNION ALL
SELECT '1271003600','92.2834655375568','5','2008' UNION ALL
SELECT '1271003600','84.6688365382596','5','2009' UNION ALL
SELECT '1271003600','89.0687598341003','6','2008' UNION ALL
SELECT '1271003600','79.1241550533365','6','2009' UNION ALL
SELECT '1271003600','123.131877952709','7','2008' UNION ALL
SELECT '1271003600','89.3938505146093','7','2009' UNION ALL
SELECT '1271003600','87.2510378856992','8','2007' UNION ALL
SELECT '1271003600','94.0687846832911','8','2008' UNION ALL
SELECT '1271003600','340.381691069552','8','2009' UNION ALL
SELECT '1271003600','80.3632310400272','9','2007' UNION ALL
SELECT '1271003600','121.96840107118','9','2008' UNION ALL
SELECT '1271003600','87.1037670414245','10','2007' UNION ALL
SELECT '1271003600','94.9861134399298','10','2008' UNION ALL
SELECT '1271003600','73.8008470324637','11','2007' UNION ALL
SELECT '1271003600','78.537373182859','11','2008' UNION ALL
SELECT '1271003600','90.7988754111144','12','2007' UNION ALL
SELECT '1271003600','88.4954811818565','12','2008' UNION ALL
SELECT '1271005700 ','651.764','1','2008' UNION ALL
SELECT '1271005700 ','650.766','1','2009' UNION ALL
SELECT '1271005700 ','897.994','2','2008' UNION ALL
SELECT '1271005700 ','688.486','2','2009' UNION ALL
SELECT '1271005700 ','721.466','3','2008' UNION ALL
SELECT '1271005700 ','718.654','3','2009' UNION ALL
SELECT '1271005700 ','913.072','4','2008' UNION ALL
SELECT '1271005700 ','863.224','4','2009' UNION ALL
SELECT '1271005700 ','735.081','5','2008' UNION ALL
SELECT '1271005700 ','647.368','5','2009' UNION ALL
SELECT '1271005700 ','607.721','6','2008' UNION ALL
SELECT '1271005700 ','580.359','6','2009' UNION ALL
SELECT '1271005700 ','718.969','7','2008' UNION ALL
SELECT '1271005700 ','693.378','7','2009' UNION ALL
SELECT '1271005700 ','734.005','8','2007' UNION ALL
SELECT '1271005700 ','591.788','8','2008' UNION ALL
SELECT '1271005700 ','679.773','8','2009' UNION ALL
SELECT '1271005700 ','560.296','9','2007' UNION ALL
SELECT '1271005700 ','685.047','9','2008' UNION ALL
SELECT '1271005700 ','682.834','10','2007' UNION ALL
SELECT '1271005700 ','608.706','10','2008' UNION ALL
SELECT '1271005700 ','586.172','11','2007' UNION ALL
SELECT '1271005700 ','650.191','11','2008' UNION ALL
SELECT '1271005700 ','858.672','12','2007' UNION ALL
SELECT '1271005700 ','711.708','12','2008'
;
--=====================================================================================================================
-- Solution to the problem starts here
--=====================================================================================================================

--===== Conditionally drop the work table to make reruns for any troubleshooting easier.
IF OBJECT_ID('TempDB..#Work','U') IS NOT NULL
DROP TABLE #Work
;
--===== Declare local variables to support the upcoming running total update
DECLARE @PrevAcct VARCHAR(20),
@PrevBal DECIMAL(28,17),
@IntAmt DECIMAL(28,17)
;
--===== This set of cascading CTE's preconditions the data for processing and later display
WITH
cteAccount AS
( --=== Find all distinct account numbers so we can apply to all dates for rates without dupes
SELECT DISTINCT Acct_No
FROM #Refunds
),
cteRate AS
( --=== Apply the account number to all date rates without dupes
SELECT acct.Acct_No,
rate.[Year],
rate.[Month],
rate.InterestRate/12 AS MonthRate
FROM #InterestRates rate
CROSS JOIN cteAccount acct
)
SELECT ISNULL(rate.Acct_No,'') AS Acct_No,
ISNULL(DATEADD(mm,rate.[Month],DATEADD(yy,rate.[YEAR]-1900,0)),0) AS MonthDate,
rate.MonthRate,
ISNULL(amt.Amount,0) AS Amount,
CAST(NULL AS DECIMAL(28,17)) AS InterestAmount,
CAST(NULL AS DECIMAL(28,17)) AS Balance
INTO #Work
FROM cteRate rate
LEFT JOIN #Refunds amt
ON rate.Acct_No = amt.Acct_No
AND rate.[Year] = amt.[Year]
AND rate.[Month] = amt.[Month]
;
--===== Add the required clustered index
ALTER TABLE #Work
ADD PRIMARY KEY CLUSTERED (Acct_No, MonthDate) WITH FILLFACTOR = 100
;
--===== Do the update to calculate the interest amounts and the running total balance.
-- This will do a million rows in about 3 to 7 seconds
UPDATE #Work
SET @IntAmt = InterestAmount = CASE
WHEN @PrevAcct = Acct_No
THEN (@PrevBal + Amount) * Monthrate
ELSE (Amount) * MonthRate
END,
@PrevBal = Balance = CASE
WHEN @PrevAcct = Acct_No
THEN @PrevBal + Amount + @IntAmt
ELSE Amount + @IntAmt
END,
@PrevAcct = Acct_No
FROM #Work WITH(TABLOCKX) --Not required on a temp table, but good habit for this type of update.
OPTION (MAXDOP 1) --Absolutely essential whether you use a temp table or not.
;

... and here's a bit on how you might display things... we can round the amounts to 2 decimal places, if you'd like.

 SELECT Acct_No, RIGHT(CONVERT(CHAR(11),MonthDate,113),8) AS [Month], Amount, InterestAmount, Balance
FROM #Work
ORDER BY Acct_No, MonthDate
;



--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #873785
Posted Saturday, February 27, 2010 2:33 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Saturday, September 13, 2014 8:01 AM
Points: 11,194, Visits: 11,165
For completeness, I'm going to add a new method to the mix (before anyone else jumps in!).

This is a method called 'set-based iteration' by MVP Hugo Kornelis, which was first brought to my attention here in a discussion of one of Jeff's fine articles.

While not quite as fast as the Quirky Update, it is in the same ballpark. It is faster than a recursive CTE on large sets, and even manages to out-perform the best CLR solution I have been able to put together, at least for some data sets.

The key to performance with the set-based iteration method is to have at least several groups in the data. It operates on a divide-and-conquer basis, and the unit of division is the group. In the present case, these groups are accounts - in general, they are the things that reset the running total. Worst case, if there is only one group, this method performs worse than an optimized cursor - so be aware of that.

In the sample data provided, there are two accounts, which is enough to make the method viable, but certainly not optimal. My guess is that the real data set contains many accounts, so that should be all good.

I have shamelessly re-used Jeff's fine test script, up to the point marked 'Solution to the problem starts here'. The following code runs from that point, and again draws heavily on Jeff's work. I hope he won't mind.

-- A work table to hold the reformatted data, and
-- ultimately, the results
CREATE TABLE #Work
(
Acct_No VARCHAR(20) NOT NULL,
MonthDate DATETIME NOT NULL,
MonthRate DECIMAL(19,12) NOT NULL,
Amount DECIMAL(19,12) NOT NULL,
InterestAmount DECIMAL(19,12) NOT NULL,
RunningTotal DECIMAL(19,12) NOT NULL,
RowRank BIGINT NOT NULL
);

-- Slightly modified from Jeff's example, to prepare
-- for the set-based iteration method
WITH Accounts
AS (
-- Get a list of the account numbers
SELECT DISTINCT Acct_No
FROM #Refunds
),
Rates
AS (
-- Apply all the accounts to all the rates
SELECT A.Acct_No,
R.[Year],
R.[Month],
MonthRate = R.InterestRate / 12
FROM #InterestRates R
CROSS
JOIN Accounts A
),
BaseData
AS (
-- The basic data we need to work with
SELECT Acct_No = ISNULL(R.Acct_No,''),
MonthDate = ISNULL(DATEADD(MONTH, R.[Month], DATEADD(YEAR, R.[year] - 1900, 0)), 0),
R.MonthRate,
Amount = ISNULL(RF.Amount,0),
InterestAmount = ISNULL(RF.Amount,0) * R.MonthRate,
RunningTotal = ISNULL(RF.Amount,0)
FROM Rates R
LEFT
JOIN #Refunds RF
ON RF.Acct_No = R.Acct_No
AND RF.[Year] = R.[Year]
AND RF.[Month] = R.[Month]
)
-- Basic data plus a rank id, numbering the rows by MonthDate, and resetting to 1 for each new Account
INSERT #Work
(Acct_No, MonthDate, MonthRate, Amount, InterestAmount, RunningTotal, RowRank)
SELECT BD.Acct_No, BD.MonthDate, BD.MonthRate, BD.Amount, BD.InterestAmount, BD.RunningTotal,
RowRank = RANK() OVER (PARTITION BY BD.Acct_No ORDER BY MonthDate)
FROM BaseData BD;

-- An index to speed the next stage (different from that used with the Quirky Update method)
CREATE UNIQUE CLUSTERED INDEX nc1 ON #Work (RowRank, Acct_No);

-- Iteration variables
DECLARE @Rank BIGINT,
@RowCount INTEGER;

-- Initialize
SELECT @Rank = 1,
@RowCount = 1;

-- This is the iteration bit, processes a rank id per iteration
-- The number of rows processed with each iteration is equal to the number of groups in the data
-- More groups --> greater efficiency
WHILE (1 = 1)
BEGIN
SET @Rank = @Rank + 1;

-- Set-based update with running totals for the current rank id
UPDATE This
SET InterestAmount = (Previous.RunningTotal + This.Amount) * This.MonthRate,
RunningTotal = Previous.RunningTotal + This.Amount + (Previous.RunningTotal + This.Amount) * This.MonthRate
FROM #Work This
JOIN #Work Previous
ON Previous.Acct_No = This.Acct_No
AND Previous.RowRank = @Rank - 1
WHERE This.RowRank = @Rank;

IF (@@ROWCOUNT = 0) BREAK;
END;

-- Show the results in natural order
SELECT *
FROM #Work
ORDER BY
Acct_No, RowRank;

I continue to recommend the Quirky Update over this method, for sheer speed, but this method does have the advantage of being fully documented and supported. I present it as 'Option 2'.

Paul

edit: to reflect the improvements in the recursive CTE method




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #873933
Posted Saturday, February 27, 2010 4:36 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Today @ 1:22 AM
Points: 1,094, Visits: 6,658
Paul White (2/27/2010)
It is certainly orders of magnitude faster than a recursive CTE on large sets


Hi Paul, an excellent solution as always, and a method to remember.

There are a couple of things I've noticed while working with CTE's over the last few weeks of using them in anger. The first is that, even with relatively small data sets, chaining CTE's can be a really bad idea. The optimiser appears to screw things up, row counts go through the roof and runtimes rapidly escalate. I've got an excellent example of this where breaking the chain by spooling into a temp table improved performance about 20-fold on a modest pagination routine. Funny thing is, the very first run took less than a second, before the optimiser lost it.
The other thing I've noticed is that recursive CTE's can be blisteringly fast provided that the required conditions are met.

So, reusing Jeff's sample data (thanks Jeff), clumsily ramping up to about about 3,000 accounts across about 100,000 rows, and processing a data set which already contains a contiguous row number in the correct order for execution, here goes...

First, the sample data:
--=====================================================================================================================
-- This is the original test data including the second account you added this evening
--=====================================================================================================================
SET NOCOUNT ON

IF OBJECT_ID('TempDB..#InterestRates','U') IS NOT NULL
DROP TABLE #InterestRates
;
CREATE TABLE #InterestRates
(
[InterestRate] DECIMAL(28,17) NULL,
[Month] [int] NULL,
[year] [int] NULL
)
;
INSERT INTO #InterestRates
(InterestRate,[Month],[Year])
SELECT '0.105','8','2007' UNION ALL
SELECT '0.105','9','2007' UNION ALL
SELECT '0.105','10','2007' UNION ALL
SELECT '0.105','11','2007' UNION ALL
SELECT '0.105','12','2007' UNION ALL
SELECT '0.105','1','2008' UNION ALL
SELECT '0.105','2','2008' UNION ALL
SELECT '0.105','3','2008' UNION ALL
SELECT '0.105','4','2008' UNION ALL
SELECT '0.105','5','2008' UNION ALL
SELECT '0.105','6','2008' UNION ALL
SELECT '0.105','7','2008' UNION ALL
SELECT '0.105','8','2008' UNION ALL
SELECT '0.105','9','2008' UNION ALL
SELECT '0.105','10','2008' UNION ALL
SELECT '0.105','11','2008' UNION ALL
SELECT '0.105','12','2008' UNION ALL
SELECT '0.105','1','2009' UNION ALL
SELECT '0.105','2','2009' UNION ALL
SELECT '0.105','3','2009' UNION ALL
SELECT '0.105','4','2009' UNION ALL
SELECT '0.105','5','2009' UNION ALL
SELECT '0.105','6','2009' UNION ALL
SELECT '0.0067','7','2009' UNION ALL
SELECT '0.0067','8','2009' UNION ALL
SELECT '0.0067','9','2009' UNION ALL
SELECT '0.0055','10','2009' UNION ALL
SELECT '0.0055','11','2009' UNION ALL
SELECT '0.0055','12','2009' UNION ALL
SELECT '0.0055','1','2010' UNION ALL
SELECT '0.0055','2','2010'
;
IF OBJECT_ID('TempDB..#Refunds','U') IS NOT NULL
DROP TABLE #Refunds
;
CREATE TABLE #Refunds
(
[Acct_no] varchar(20) NULL,
[Amount] DECIMAL(28,17) NULL,
[Month] [int] NULL,
[year] [int] NULL
)
;
INSERT INTO #Refunds
(Acct_no,Amount, [Month],[year])
SELECT '1271003600','333.107456586203','1','2008' UNION ALL
SELECT '1271003600','87.6816131178288','1','2009' UNION ALL
SELECT '1271003600','103.602002310821','2','2008' UNION ALL
SELECT '1271003600','81.9722910242125','2','2009' UNION ALL
SELECT '1271003600','72.3401647559977','3','2008' UNION ALL
SELECT '1271003600','86.1625895742832','3','2009' UNION ALL
SELECT '1271003600','103.361218989159','4','2008' UNION ALL
SELECT '1271003600','84.6065155612006','4','2009' UNION ALL
SELECT '1271003600','92.2834655375568','5','2008' UNION ALL
SELECT '1271003600','84.6688365382596','5','2009' UNION ALL
SELECT '1271003600','89.0687598341003','6','2008' UNION ALL
SELECT '1271003600','79.1241550533365','6','2009' UNION ALL
SELECT '1271003600','123.131877952709','7','2008' UNION ALL
SELECT '1271003600','89.3938505146093','7','2009' UNION ALL
SELECT '1271003600','87.2510378856992','8','2007' UNION ALL
SELECT '1271003600','94.0687846832911','8','2008' UNION ALL
SELECT '1271003600','340.381691069552','8','2009' UNION ALL
SELECT '1271003600','80.3632310400272','9','2007' UNION ALL
SELECT '1271003600','121.96840107118','9','2008' UNION ALL
SELECT '1271003600','87.1037670414245','10','2007' UNION ALL
SELECT '1271003600','94.9861134399298','10','2008' UNION ALL
SELECT '1271003600','73.8008470324637','11','2007' UNION ALL
SELECT '1271003600','78.537373182859','11','2008' UNION ALL
SELECT '1271003600','90.7988754111144','12','2007' UNION ALL
SELECT '1271003600','88.4954811818565','12','2008' UNION ALL
SELECT '1271005700 ','651.764','1','2008' UNION ALL
SELECT '1271005700 ','650.766','1','2009' UNION ALL
SELECT '1271005700 ','897.994','2','2008' UNION ALL
SELECT '1271005700 ','688.486','2','2009' UNION ALL
SELECT '1271005700 ','721.466','3','2008' UNION ALL
SELECT '1271005700 ','718.654','3','2009' UNION ALL
SELECT '1271005700 ','913.072','4','2008' UNION ALL
SELECT '1271005700 ','863.224','4','2009' UNION ALL
SELECT '1271005700 ','735.081','5','2008' UNION ALL
SELECT '1271005700 ','647.368','5','2009' UNION ALL
SELECT '1271005700 ','607.721','6','2008' UNION ALL
SELECT '1271005700 ','580.359','6','2009' UNION ALL
SELECT '1271005700 ','718.969','7','2008' UNION ALL
SELECT '1271005700 ','693.378','7','2009' UNION ALL
SELECT '1271005700 ','734.005','8','2007' UNION ALL
SELECT '1271005700 ','591.788','8','2008' UNION ALL
SELECT '1271005700 ','679.773','8','2009' UNION ALL
SELECT '1271005700 ','560.296','9','2007' UNION ALL
SELECT '1271005700 ','685.047','9','2008' UNION ALL
SELECT '1271005700 ','682.834','10','2007' UNION ALL
SELECT '1271005700 ','608.706','10','2008' UNION ALL
SELECT '1271005700 ','586.172','11','2007' UNION ALL
SELECT '1271005700 ','650.191','11','2008' UNION ALL
SELECT '1271005700 ','858.672','12','2007' UNION ALL
SELECT '1271005700 ','711.708','12','2008'
;
--=====================================================================================================================
-- Solution to the problem starts here
--=====================================================================================================================

--===== Conditionally drop the work table to make reruns for any troubleshooting easier.
IF OBJECT_ID('TempDB..#Work','U') IS NOT NULL
DROP TABLE #Work;


--===== This set of cascading CTE's preconditions the data for processing and later display
WITH
cteAccount AS
( --=== Find all distinct account numbers so we can apply to all dates for rates without dupes
SELECT DISTINCT Acct_No
FROM #Refunds
),
cteRate AS
( --=== Apply the account number to all date rates without dupes
SELECT acct.Acct_No,
rate.[Year],
rate.[Month],
rate.InterestRate/12 AS MonthRate
FROM #InterestRates rate
CROSS JOIN cteAccount acct
)
SELECT ExecSeq = ISNULL(CAST(0 AS INT), 0),
ISNULL(rate.Acct_No,'') AS Acct_No,
ISNULL(DATEADD(mm,rate.[Month],DATEADD(yy,rate.[YEAR]-1900,0)),0) AS MonthDate,
rate.MonthRate,
ISNULL(amt.Amount,0) AS Amount,
CAST(NULL AS DECIMAL(28,17)) AS InterestAmount,
CAST(NULL AS DECIMAL(28,17)) AS Balance
INTO #Work
FROM cteRate rate
LEFT JOIN #Refunds amt
ON rate.Acct_No = amt.Acct_No
AND rate.[Year] = amt.[Year]
AND rate.[Month] = amt.[Month]
;


-- scale up the table yuck but it's a one-off
DECLARE @Acct_no INT, @Counter INT
SET @Counter = 1

WHILE @Counter < 3220 -- 322 corresponds to 10,000-ish rows
BEGIN
SELECT @Acct_no = MAX(Acct_no) FROM #Work

INSERT INTO #Work (ExecSeq, Acct_no, MonthDate, MonthRate, Amount)
SELECT ExecSeq, Acct_no = CAST(@Acct_no+1 AS VARCHAR(20)), MonthDate, MonthRate, Amount
FROM #Work WHERE Acct_no = CAST(@Acct_no AS VARCHAR(20)) -- 25 rows
SET @Counter = @Counter + 1
END

-- put in the all important execution order
UPDATE w SET ExecSeq = t.ExecSeq
FROM #Work w
INNER JOIN (SELECT ExecSeq = ROW_NUMBER() OVER(ORDER BY Acct_no, MonthDate), Acct_no, MonthDate FROM #Work) t
ON t.Acct_no = w.Acct_no AND t.MonthDate = w.MonthDate

--===== Add the very useful clustered index
CREATE CLUSTERED INDEX [ExecSeq] ON #Work ([ExecSeq] ASC)

Here's the slightly modified solution:
;WITH Calculator AS (
SELECT ExecSeq,
Acct_no,
Amount,
MonthDate,
MonthRate,
InterestAmount = ISNULL(InterestAmount,0)
FROM #Work w
WHERE ExecSeq = 1
UNION ALL
SELECT cr.ExecSeq,
ISNULL(cr.Acct_no, lr.Acct_no),
ISNULL(cr.Amount, 0),
cr.MonthDate,
cr.MonthRate,
InterestAmount = CAST(CASE WHEN cr.Acct_no = lr.Acct_no THEN (lr.Amount + lr.InterestAmount) * (1 + (cr.MonthRate)) ELSE 0 END AS DECIMAL(28,17))
FROM Calculator lr
INNER JOIN #Work cr ON cr.ExecSeq = lr.ExecSeq+1
)
SELECT * FROM Calculator ORDER BY ExecSeq OPTION (MAXRECURSION 0)


This runs in 5 seconds, or 3 if the ORDER BY is removed (???). If a single row is selected, say the second last (ExecSeq = 99850), it takes about 3 seconds and the result is correct.

100,000 rows, 5 seconds - that's not a misprint either Uncle Jeff
A running total over a million rows of a simple but properly structured and indexed table can take as little as three or four seconds using this method.

Option 1, version 2.

Nathan, regarding recursive CTE's; "the reference to self in the recursive part represents the results of the previous iteration". It might be nowhere near what SQL Server is doing under the bonnet but it's all you need to know to make recursive CTE's work.

Cheers

ChrisM



Low-hanging fruit picker and defender of the moggies





For better assistance in answering your questions, please read this.




Understanding and using APPLY, (I) and (II) Paul White

Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Post #873955
Posted Saturday, February 27, 2010 5:17 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Saturday, September 13, 2014 8:01 AM
Points: 11,194, Visits: 11,165
Hi Chris,

I had to add a quick index on acct_no to get the table scale-up thing to work in my lifetime (the MAX gets out of hand without it!), but other than that:

This looks to be an interesting way to speed up this sort of recursive query, and it certainly seems to work. There are a number of similarities to the set-based iteration method, like the pre-computing of an order using a ranking function. I want to look at this a bit more deeply before saying much more, but the index spool at the 'lowest level' of the plan seems to be one of the keys to success here - it saves scanning the input table a good many times. I would expect an explicit index on the same keys as the spool to be beneficial, I'll come back to this.

You're right that recursive CTEs can be fast, but it is relative. I sometimes use them to seek down the distinct keys of a large index rather than scanning the whole thing and relying on an aggregate, for example. Where a good set-based alternative exists I usually go with that. Maybe I'll revisit that once I've had a deeper think on this.

I must admit I spent a few minutes expanding the data set to work with the set-based iteration method to see how that went. The recursive CTE ran for just over seven seconds on my old laptop before starting to return results. The set-based iteration method ran for two

Paul




Paul White
SQL Server MVP
SQLblog.com
@SQL_Kiwi
Post #873961
« Prev Topic | Next Topic »

Add to briefcase ««12345»»»

Permissions Expand / Collapse