For completeness, I'm going to add a new method to the mix (before anyone else jumps in!).
This is a method called 'set-based iteration' by MVP Hugo Kornelis, which was first brought to my attention here in a discussion of one of Jeff's fine articles.
While not quite as fast as the Quirky Update, it is in the same ballpark. It is faster than a recursive CTE on large sets, and even manages to out-perform the best CLR solution I have been able to put together, at least for some data sets.
The key to performance with the set-based iteration method is to have at least several groups in the data. It operates on a divide-and-conquer basis, and the unit of division is the group. In the present case, these groups are accounts - in general, they are the things that reset the running total. Worst case, if there is only one group, this method performs worse than an optimized cursor - so be aware of that.
In the sample data provided, there are two accounts, which is enough to make the method viable, but certainly not optimal. My guess is that the real data set contains many accounts, so that should be all good.
I have shamelessly re-used Jeff's fine test script, up to the point marked 'Solution to the problem starts here'. The following code runs from that point, and again draws heavily on Jeff's work. I hope he won't mind.
-- A work table to hold the reformatted data, and
-- ultimately, the results
CREATE TABLE #Work
(
Acct_No VARCHAR(20) NOT NULL,
MonthDate DATETIME NOT NULL,
MonthRate DECIMAL(19,12) NOT NULL,
Amount DECIMAL(19,12) NOT NULL,
InterestAmount DECIMAL(19,12) NOT NULL,
RunningTotal DECIMAL(19,12) NOT NULL,
RowRank BIGINT NOT NULL
);
-- Slightly modified from Jeff's example, to prepare
-- for the set-based iteration method
WITH Accounts
AS (
-- Get a list of the account numbers
SELECT DISTINCT Acct_No
FROM #Refunds
),
Rates
AS (
-- Apply all the accounts to all the rates
SELECT A.Acct_No,
R.[Year],
R.[Month],
MonthRate = R.InterestRate / 12
FROM #InterestRates R
CROSS
JOIN Accounts A
),
BaseData
AS (
-- The basic data we need to work with
SELECT Acct_No = ISNULL(R.Acct_No,''),
MonthDate = ISNULL(DATEADD(MONTH, R.[Month], DATEADD(YEAR, R.[year] - 1900, 0)), 0),
R.MonthRate,
Amount = ISNULL(RF.Amount,0),
InterestAmount = ISNULL(RF.Amount,0) * R.MonthRate,
RunningTotal = ISNULL(RF.Amount,0)
FROM Rates R
LEFT
JOIN #Refunds RF
ON RF.Acct_No = R.Acct_No
AND RF.[Year] = R.[Year]
AND RF.[Month] = R.[Month]
)
-- Basic data plus a rank id, numbering the rows by MonthDate, and resetting to 1 for each new Account
INSERT #Work
(Acct_No, MonthDate, MonthRate, Amount, InterestAmount, RunningTotal, RowRank)
SELECT BD.Acct_No, BD.MonthDate, BD.MonthRate, BD.Amount, BD.InterestAmount, BD.RunningTotal,
RowRank = RANK() OVER (PARTITION BY BD.Acct_No ORDER BY MonthDate)
FROM BaseData BD;
-- An index to speed the next stage (different from that used with the Quirky Update method)
CREATE UNIQUE CLUSTERED INDEX nc1 ON #Work (RowRank, Acct_No);
-- Iteration variables
DECLARE @Rank BIGINT,
@RowCount INTEGER;
-- Initialize
SELECT @Rank = 1,
@RowCount = 1;
-- This is the iteration bit, processes a rank id per iteration
-- The number of rows processed with each iteration is equal to the number of groups in the data
-- More groups --> greater efficiency
WHILE (1 = 1)
BEGIN
SET @Rank = @Rank + 1;
-- Set-based update with running totals for the current rank id
UPDATE This
SET InterestAmount = (Previous.RunningTotal + This.Amount) * This.MonthRate,
RunningTotal = Previous.RunningTotal + This.Amount + (Previous.RunningTotal + This.Amount) * This.MonthRate
FROM #Work This
JOIN #Work Previous
ON Previous.Acct_No = This.Acct_No
AND Previous.RowRank = @Rank - 1
WHERE This.RowRank = @Rank;
IF (@@ROWCOUNT = 0) BREAK;
END;
-- Show the results in natural order
SELECT *
FROM #Work
ORDER BY
Acct_No, RowRank;
I continue to recommend the Quirky Update over this method, for sheer speed, but this method does have the advantage of being fully documented and supported. I present it as 'Option 2'.
Paul
edit: to reflect the improvements in the recursive CTE method
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi