Replace calculation in cursor with set based logic

  • Hello, and thanks if you can help. The calculation in the cursor below is correct; alas, my attempt at doing the same calculation in a SELECT statement is not. This is SQL Server 2019.

    DROP TABLE IF EXISTS #Test

    CREATE TABLE #Test (PricingDate DATE NOT NULL,Price DECIMAL (5,2) NOT NULL,Factor DECIMAL (9,7) NULL)
    INSERT INTO #Test VALUES ('2025-04-01',11.94,NULL)
    INSERT INTO #Test VALUES ('2025-04-02',11.39,-4.6063581)
    INSERT INTO #Test VALUES ('2025-04-03',11.88,4.3020044)
    INSERT INTO #Test VALUES ('2025-04-04',11.83,-0.0252615)
    INSERT INTO #Test VALUES ('2025-04-05',12.19,3.0431671)
    INSERT INTO #Test VALUES ('2025-04-06',12.64,3.691608)
    --SELECT * FROM #Test

    DROP TABLE IF EXISTS #Output

    CREATE TABLE #Output (PricingDate DATE NOT NULL,Price DECIMAL (5,2) NOT NULL,Factor DECIMAL (9,7) NULL,Calculation DECIMAL (9,7) NULL)

    DECLARE
    @PricingDate DATE,
    @Price DECIMAL (5,2),
    @Factor DECIMAL (9,7),
    @Calculation DECIMAL (9,7),
    @PreviousCalculation DECIMAL (9,7),
    @Count SMALLINT = 1

    DECLARE TestCursor CURSOR FOR SELECT PricingDate,Price,Factor FROM #Test ORDER BY PricingDate

    OPEN TestCursor
    FETCH NEXT FROM TestCursor INTO @PricingDate,@Price,@Factor
    WHILE @@FETCH_STATUS = 0
    BEGIN
        --SELECT @PricingDate,@Price,@Factor
        IF @Count = 1
        BEGIN
            --SET @Calculation = (1 + (COALESCE(@Factor,0)/100)) * @Price
            SET @Calculation = @Price
            SET @PreviousCalculation = @Calculation
            SET @Count = @Count + 1
            INSERT INTO #Output SELECT @PricingDate,@Price,@Factor,@Calculation
            --SELECT @PricingDate AS PricingDate,@Price AS Price,@Factor AS Factor,@Calculation AS Calculation
        END
        ELSE
        BEGIN
            SET @Calculation = (1 + (COALESCE(@Factor,0)/100)) * @PreviousCalculation
            SET @Count = @Count + 1
            SET @PreviousCalculation = @Calculation
            INSERT INTO #Output SELECT @PricingDate,@Price,@Factor,@Calculation
            --SELECT @PricingDate AS PricingDate,@Price AS Price,@Factor AS Factor,@Calculation AS Calculation
        END
        FETCH NEXT FROM TestCursor INTO @PricingDate,@Price,@Factor
    END

    CLOSE TestCursor;
    DEALLOCATE TestCursor;

    SELECT
        ROW_NUMBER() OVER (ORDER BY PricingDate) AS RowNumber,
        PricingDate,
        Price,
        LAG(Price) OVER (ORDER BY PricingDate) AS LagPrice,
        Factor,
        LAG(Factor) OVER (ORDER BY PricingDate) AS LagFactor,
        Calculation,
        --NewCalculation
        CASE
            WHEN ROW_NUMBER() OVER (ORDER BY PricingDate) = 1
                THEN Price
            WHEN ROW_NUMBER() OVER (ORDER BY PricingDate) >= 2
                THEN (1 + (COALESCE(@Factor,0)/100)) *
                     (
                         (1 + (COALESCE(LAG(Factor) OVER (ORDER BY PricingDate),0)/100))
                         *
                         (1 + (COALESCE(@Factor,0)/100)) * (LAG(Price) OVER (ORDER BY PricingDate))
                     )
        END AS NewCalculation
    FROM #Output
  • If you're trying to create a running total, use a windowing function.

    SUM(1 + COALESCE(Factor,0)/100) OVER (ORDER BY PricingDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)

    ?

    Just make sure the "window" defining which rows you want to accumulate is right.
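
    Since the factors compound, this is really a running product rather than a running sum, but the same windowing idea applies if you go through logarithms. A minimal sketch, assuming every (1 + Factor/100) stays positive, and accepting that the FLOAT round trip inside LOG/EXP can differ from exact DECIMAL arithmetic in the last digits:

    SELECT PricingDate, Price, Factor,
           -- first price compounded by every factor seen so far
           FIRST_VALUE(Price) OVER (ORDER BY PricingDate)
           * EXP(SUM(LOG(1 + COALESCE(Factor, 0) / 100.0))
                 OVER (ORDER BY PricingDate
                       ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)) AS Calculation
    FROM #Test;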

  • Thanks, I'll take a look at that windowing method. I don't think this is a running total, but rather a calculation which needs the calculation from the prior row.

    Also, I think I confused matters: in my attempted NewCalculation I was referencing a local variable (@Factor) rather than the Factor column from the #Output temp table. So my attempt should have been the below, although the correction still does not get my desired result.

    SELECT
        ROW_NUMBER() OVER (ORDER BY PricingDate) AS RowNumber,
        PricingDate,
        Price,
        LAG(Price) OVER (ORDER BY PricingDate) AS LagPrice,
        Factor,
        LAG(Factor) OVER (ORDER BY PricingDate) AS LagFactor,
        Calculation,
        --NewCalculation
        CASE
            WHEN ROW_NUMBER() OVER (ORDER BY PricingDate) = 1
                THEN Price
            WHEN ROW_NUMBER() OVER (ORDER BY PricingDate) >= 2
                THEN (1 + (COALESCE(Factor,0)/100)) *
                     (
                         (1 + (COALESCE(LAG(Factor) OVER (ORDER BY PricingDate),0)/100))
                         *
                         (1 + (COALESCE(Factor,0)/100)) * (LAG(Price) OVER (ORDER BY PricingDate))
                     )
        END AS NewCalculation
    FROM #Output

  • My current thinking is this needs to be done using recursion. The calculation needs to reference the previous calculation, which needs to reference the previous calculation, and so on. I was hoping to do this in a view. I've always stayed away from recursive queries for performance reasons, and recursion just seems like a cursor under the hood. Still looking for help on this, and I would much appreciate it if someone could point me in the right direction.

  • Why are you using these data types with this numerical precision? By making @Factor DECIMAL (9,7) and applying this percentage to a price, you add significance to decimal places to the right of the 4th place. This means there will be an inscrutable influence on calculations where the balance is carried forward, which is what's happening. One possible way to handle this could be to CAST the intermediate calculations to the MONEY data type; however, I wouldn't recommend it. Why is this precision called for? Without knowing more it's difficult to say anything.

    Just duplicating the calculation, without worrying about issues related to precision (which in this case are a severe pita imo), maybe this:

    with lag_cte as (
        select *, lag(Price) over (order by PricingDate) lag_price
        from #Test
        cross apply (values ((1 + (isnull(Factor,0)/100)))) v(factor_pct))
    select *, lag_price*factor_pct
    from lag_cte;

    No one shall expel us from the paradise that Cantor has created for us.

  • A quirky update seems to work, although that requires adding a Calculation column to the original table and making sure the original table is clustered by PricingDate, viz:

    CREATE TABLE #Test (PricingDate DATE NOT NULL PRIMARY KEY CLUSTERED,Price DECIMAL (5,2) NOT NULL,Factor DECIMAL (9,7) NULL,Calculation DECIMAL(9, 7))

    DECLARE @Calculation decimal(9, 7);
    DECLARE @Factor decimal(9, 7);
    DECLARE @PreviousCalculation decimal(9, 7);
    DECLARE @Price decimal(5, 2)
    DECLARE @PricingDate date;

    ;WITH CteTest AS (
        /* Cte not required here, oc; I used it to try to find a way
           to not have to add a Calculation column to main table */
        SELECT *
        FROM #Test
    )
    UPDATE CteTest
    SET @PreviousCalculation = CASE WHEN Factor IS NULL THEN Price ELSE @Calculation END,
        @Calculation = Calculation = (1 + (COALESCE(Factor,0)/100)) * @PreviousCalculation
    FROM CteTest WITH (TABLOCKX)

    SELECT * FROM #Output ORDER BY PricingDate;
    SELECT * FROM #Test ORDER BY PricingDate;
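
    One hedged caveat: the quirky update is generally considered dependable only with its usual safeguards in place -- the clustered index driving the scan order, TABLOCKX, and MAXDOP 1 to rule out parallelism. A sketch of a belt-and-braces version of the same UPDATE (the INDEX(1) hint assumes the clustered index is index id 1, which it is for the PRIMARY KEY CLUSTERED above):

    UPDATE t
    SET @PreviousCalculation = CASE WHEN Factor IS NULL THEN Price ELSE @Calculation END,
        @Calculation = Calculation = (1 + (COALESCE(Factor,0)/100)) * @PreviousCalculation
    FROM #Test t WITH (TABLOCKX, INDEX(1))
    OPTION (MAXDOP 1);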

    SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

  • ScottPletcher wrote:

    A quirky update seems to work

    Thanks Scott, appreciate it. I am trying to build a view, since that is what my downstream processing expects. I forgot about the quirky update; it's always good to go back and read Jeff's article. I am studying Steve's reply and will respond in depth there.

  • Steve Collins wrote:

    Why are you using these data types with this numerical precision?   

    The precision issue is a problem. I have a legacy system with certain data types and a cursor, then a new system I want to migrate to with different data types, where I am trying to use a set-based SELECT statement. Then I have my test harness here, where I just used data types for demonstration.

    Ideally I'd have a subject matter expert to tell me what precision/rounding to use. Realistically, the legacy system is going to be considered 'correct'. My goal was to replicate the legacy system, since it's the only version of the truth I have.

    The Cursor and Quirky methods return the same precision as my legacy system. They also match an Excel proof of concept I have.

    The CrossApply method strays, and I assume this straying will increase over longer periods of time.

    I guess I am struggling to understand why the Cursor and Quirky methods replicate my legacy results and the CrossApply method won't. Can I make a set-based method consistent with my legacy results by appropriate rounding? I have tried, but it seems to get worse. Is the set-based method just correct in a different way, or is it inherently flawed for this type of problem? Would a windowing method have the same issues as the CrossApply?

    Notes on the revised script below: I am showing the Cursor, Quirky and CrossApply methods. I have added an EntityID column to better represent my actual data. I can't make EntityID work with the CrossApply method (a partitioned variant is sketched after the script).

    Much thanks to this community.

    DROP TABLE IF EXISTS #Test

    --Test data
    CREATE TABLE #Test (EntityID INT NOT NULL,PricingDate DATE NOT NULL,Price DECIMAL (5,2) NOT NULL,Factor DECIMAL (9,7) NULL,Calculation DECIMAL(9, 7))--Add calculation to support Quirky
    INSERT INTO #Test VALUES (1,'2025-04-01',11.94,NULL,NULL)
    INSERT INTO #Test VALUES (1,'2025-04-02',11.39,-4.6063581,NULL)
    INSERT INTO #Test VALUES (1,'2025-04-03',11.88,4.3020044,NULL)
    INSERT INTO #Test VALUES (1,'2025-04-04',11.83,-0.0252615,NULL)
    INSERT INTO #Test VALUES (1,'2025-04-05',12.19,3.0431671,NULL)
    INSERT INTO #Test VALUES (1,'2025-04-06',12.64,3.691608,NULL)

    INSERT INTO #Test VALUES (2,'2025-04-01',11.94,NULL,NULL)
    INSERT INTO #Test VALUES (2,'2025-04-02',11.39,-4.6063581,NULL)
    INSERT INTO #Test VALUES (2,'2025-04-03',11.88,4.3020044,NULL)
    INSERT INTO #Test VALUES (2,'2025-04-04',11.83,-0.0252615,NULL)
    INSERT INTO #Test VALUES (2,'2025-04-05',12.19,3.0431671,NULL)
    INSERT INTO #Test VALUES (2,'2025-04-06',12.64,3.691608,NULL)

    ALTER TABLE #Test ADD PRIMARY KEY CLUSTERED (EntityID,PricingDate)

    --Table #Output is used to capture cursor output
    DROP TABLE IF EXISTS #Output
    CREATE TABLE #Output (EntityID INT NOT NULL,PricingDate DATE NOT NULL,Price DECIMAL (5,2) NOT NULL,Factor DECIMAL (9,7) NULL,Calculation DECIMAL (9,7) NULL)

    --Cursor variables
    DECLARE
        @EntityID INT,
        @EntityIDPrevious INT = 0,
        @PricingDate DATE,
        @Price DECIMAL (5,2),
        @Factor DECIMAL (9,7),
        @Calculation DECIMAL (9,7),
        @PreviousCalculation DECIMAL (9,7),
        @Count SMALLINT = 1

    --Cursor processing
    DECLARE TestCursor CURSOR FOR SELECT EntityID,PricingDate,Price,Factor FROM #Test ORDER BY EntityID,PricingDate

    OPEN TestCursor
    FETCH NEXT FROM TestCursor INTO @EntityID,@PricingDate,@Price,@Factor
    WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @EntityID <> @EntityIDPrevious
        BEGIN
            SET @Calculation = @Price
            SET @PreviousCalculation = @Calculation
            --SET @Count = @Count + 1
            SET @EntityIDPrevious = @EntityID
            INSERT INTO #Output SELECT @EntityID,@PricingDate,@Price,@Factor,@Calculation
            --SELECT @PricingDate AS PricingDate,@Price AS Price,@Factor AS Factor,@Calculation AS Calculation
        END
        ELSE
        BEGIN
            SET @Calculation = (1 + (COALESCE(@Factor,0)/100)) * @PreviousCalculation
            --SET @Count = @Count + 1
            SET @EntityIDPrevious = @EntityID
            SET @PreviousCalculation = @Calculation
            INSERT INTO #Output SELECT @EntityID,@PricingDate,@Price,@Factor,@Calculation
            --SELECT @PricingDate AS PricingDate,@Price AS Price,@Factor AS Factor,@Calculation AS Calculation
        END
        FETCH NEXT FROM TestCursor INTO @EntityID,@PricingDate,@Price,@Factor
    END

    CLOSE TestCursor;
    DEALLOCATE TestCursor;

    SELECT 'Cursor',* FROM #Output ORDER BY PricingDate, EntityID

    DROP TABLE IF EXISTS #Quirky
    SELECT * INTO #Quirky FROM #Test
    ALTER TABLE #Quirky ADD PRIMARY KEY CLUSTERED (EntityID,PricingDate) --SELECT...INTO does not copy the clustered PK, which the quirky update depends on

    ;WITH CteTest AS (
        /* Cte not required here, oc; I used it to try to find a way
           to not have to add a Calculation column to main table */
        SELECT *
        FROM #Quirky
    )
    UPDATE CteTest
    SET @PreviousCalculation = CASE WHEN Factor IS NULL THEN Price ELSE @Calculation END,
        @Calculation = Calculation = (1 + (COALESCE(Factor,0)/100)) * @PreviousCalculation
    FROM CteTest WITH (TABLOCKX)

    SELECT 'Quirky',* FROM #Quirky ORDER BY PricingDate;


    WITH lag_cte AS
    (
        SELECT *, LAG(Price) OVER (ORDER BY PricingDate) lag_price
        FROM #Test
        CROSS APPLY (VALUES ((1 + (ISNULL(Factor,0)/100)))) v(factor_pct)
        WHERE EntityID = 1
    )
    SELECT 'CrossApply',*, lag_price*factor_pct
    FROM lag_cte;
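
    For the EntityID limitation mentioned above, a hedged sketch: partitioning the LAG window by EntityID and dropping the WHERE filter lets the CrossApply method run per entity, though it still lags the raw price rather than the compounded calculation:

    WITH lag_cte AS
    (
        SELECT *, LAG(Price) OVER (PARTITION BY EntityID ORDER BY PricingDate) lag_price
        FROM #Test
        CROSS APPLY (VALUES ((1 + (ISNULL(Factor,0)/100)))) v(factor_pct)
    )
    SELECT 'CrossApply',*, lag_price*factor_pct
    FROM lag_cte
    ORDER BY EntityID, PricingDate;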
  • Seems like you should be able to CAST the end result of the query to get less precision. The only problem is if you're losing precision in the CTE.
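
    For instance, a minimal sketch, assuming four decimal places is the target:

    SELECT PricingDate, Price, Factor,
           CAST(Calculation AS DECIMAL(19,4)) AS Calculation
    FROM #Output
    ORDER BY PricingDate;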

  • I would change the following

    (1 + (COALESCE(Factor,0)/100)) * @PreviousCalculation

    to be

    (1 + (COALESCE(Factor,0.)/100.)) * @PreviousCalculation -- notice the "dot" on the divide (100.) and on the COALESCE of Factor

    Depending on the values, you may also get silent truncation (instead of the rounding that would be the default in other tools), so you may need to add explicit rounding to the calculation.
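
    A quick illustration of why the dot matters when an operand is a whole number (a minimal demo, independent of the tables above; here Factor is already DECIMAL, so the divide is decimal either way, but the dot protects you if a plain INT ever sneaks in):

    SELECT 5/100  AS int_div,  -- 0, integer division
           5/100. AS dec_div;  -- 0.050000, decimal division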

  • My query was insufficient because it applied the factor percentage to the lagged price and not to the lagged calculated price. It seems recursion is unavoidable. This query uses a recursive CTE:

    with recur_cte as (
        select PricingDate, Price, Factor, cast(Price as decimal(9,7)) Calculation
        from #Test
        where Factor is null
        union all
        select t.PricingDate, t.Price, t.Factor, cast((1 + (isnull(t.Factor,0)/100)) * r.Calculation as decimal(9,7))
        from #Test t
        join recur_cte r on t.PricingDate = dateadd(day, 1, r.PricingDate))
    select *
    from recur_cte
    order by PricingDate;
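
    A hedged extension for the revised script: to cover the EntityID column, and dates that aren't strictly consecutive, join on a per-entity row number instead of dateadd(). This is untested against the legacy output, and long histories would also need OPTION (MAXRECURSION 0):

    with numbered as (
        select *, row_number() over (partition by EntityID order by PricingDate) rn
        from #Test),
    recur_cte as (
        select EntityID, PricingDate, Price, Factor, rn, cast(Price as decimal(9,7)) Calculation
        from numbered
        where rn = 1
        union all
        select n.EntityID, n.PricingDate, n.Price, n.Factor, n.rn,
            cast((1 + (isnull(n.Factor,0)/100)) * r.Calculation as decimal(9,7))
        from numbered n
        join recur_cte r on n.EntityID = r.EntityID and n.rn = r.rn + 1)
    select *
    from recur_cte
    order by EntityID, PricingDate;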

    No one shall expel us from the paradise that Cantor has created for us.

  • Regarding the precision of the data types:

    cast(Price as decimal(9,7)) Calculation

    The line above CASTs the price (Price) to DECIMAL(9, 7). Imo this is not a good practice. When it comes to US dollars there are only 4 places of significance to the right of the decimal point. Any digits further to the right, i.e. the 5th onward, don't impact rounding to the penny. The SQL Server MONEY data type is a weird space-saving utility, roughly equivalent to FLOAT but with only 4-digit significance. Imo the MONEY data type is of questionable necessity and has very limited application.

    When the Factor column's row value is 4.3020044 and it is applied to $1.0000000 (which is imaginary and cannot exist), the result is insignificant digits being populated with value. Reports and various aggregated numbers might not add up to the penny. In my experience these sorts of discrepancies are frustrating for data users to encounter, and for the SQL developer, making the numbers add up exactly and answering the resulting questions can be a difficult conversation.
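
    To make that concrete, a small demo using the first test price and one of the later factors (illustrative numbers only):

    SELECT CAST(11.94 * (1 + 4.3020044/100) AS DECIMAL(19,10)) AS full_value,  -- 12.4536593254
           ROUND(11.94 * (1 + 4.3020044/100), 4) AS four_places,               -- 12.4537
           ROUND(11.94 * (1 + 4.3020044/100), 2) AS to_the_penny;              -- 12.45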

    No one shall expel us from the paradise that Cantor has created for us.

  • Thanks to all who responded. I'll review and post back.
