Restart runningtotal when predetermined value is reached

  • Hi, Im trying to create a running total based on FieldToSum that will restart everytime runningTotal <= 25 and count the number of times this happens in runningCount.

    This needs to be done for both forwards and backwards.

    This is currently what i have inherited (made some changes based on articles here already which has helped).

    There are approximately 40 million records this needs to be done for.

    If you run the following code then that will give the answer im after albeit quite slowly.

    Im after any sort of help/advice with this. Also this is my first post so if im missing any information please let me know.

    Current code

    IF OBJECT_ID('TempDB..#runTotal','U') IS NOT NULL

    DROP TABLE #runTotal

    CREATE TABLE #runTotal

    (

    runTotalID INT IDENTITY(1,1),

    tableID int,

    FieldToSum FLOAT,

    runningTotal float,

    runningCount int

    )

    IF OBJECT_ID('TempDB..#table','U') IS NOT NULL

    DROP TABLE #table

    CREATE TABLE #table

    (

    tableID int NOT NULL,

    runningCount int not null,

    orderCol int,

    FieldToSum float

    )

    insert #table

    select1, 0, 1, 4.733415944 union all

    select2, 0, 8, 4.733415944 union all

    select3, 0, 6, 4.733415944 union all

    select4, 0, 19, 4.733415944 union all

    select5, 0, 14, 6.950871733 union all

    select6, 0, 3, 6.942781406 union all

    select7, 0, 2, 6.956403557 union all

    select8, 0, 20, 6.945977457 union all

    select9, 0, 11, 6.956403557 union all

    select10, 0, 13, 6.956403557 union all

    select11, 0, 4, 6.945977457 union all

    select12, 0, 10, 6.956403557 union all

    select13, 0, 17, 6.945977457 union all

    select14, 0, 12, 6.956403557 union all

    select15, 0, 15, 6.968313001 union all

    select16, 0, 5, 6.968313001 union all

    select17, 0, 7, 7.024537923 union all

    select18, 0, 18, 7.024537923 union all

    select19, 0, 16, 6.968313001 union all

    select20, 0, 9, 6.968313001

    CREATE UNIQUE CLUSTERED INDEX [Idx1] ON #runTotal(runTotalID )

    Declare @flag AS int=0

    DECLARE @Counter As INT =1

    declare @runningTotal as float

    declare @anchor int

    WHILE @Flag=0 BEGIN

    IF (select COUNT(1) from #table where runningCount=0)<>0

    BEGIN

    TRUNCATE Table #runTotal

    INSERT INTO #runTotal

    SELECT tableID, FieldToSum, 0, 0 FROM #table where runningCount=0 ORDER BY orderCol ASC

    set @runningTotal = 0

    update #runTotal

    set @runningTotal = runningTotal = @runningtotal + FieldToSum,

    @anchor = runtotalid

    from #runTotal WITH (TABLOCKX) OPTION (MAXDOP 1)

    UPDATE #table

    SET runningCount=@Counter from #table a inner join #runTotal b on a.tableID = b.tableID where b.runningTotal <= 25

    END

    ELSE

    SET @FLAG=1

    IF (select COUNT(1) from #table where runningCount=0)<>0

    BEGIN

    TRUNCATE Table #runTotal

    INSERT INTO #runTotal

    SELECT tableID, FieldToSum, 0, 0 FROM #table where runningCount=0 ORDER BY orderCol DESC

    set @runningTotal = 0

    update #runTotal

    set @runningTotal = runningTotal = @runningtotal + FieldToSum,

    @anchor = runtotalid

    from #runTotal WITH (TABLOCKX) OPTION (MAXDOP 1)

    UPDATE #table

    SET runningCount=@Counter from #table a inner join #runTotal b on a.tableID = b.tableID where b.runningTotal <= 25

    END

    ELSE

    SET @FLAG=1

    SET @Counter=@Counter+1

    END

  • This process uses the 'Quirky Update' without the essential requirements being in place. As such it is unreliable & will almost certainly give incorrect results in the future (or may have already done so).

    Quirky Update is where variable & column are updated together:

    update #runTotal

    set @runningTotal = runningTotal = @runningtotal + FieldToSum

    See this:

    http://www.sqlservercentral.com/articles/T-SQL/68467/

    for more details - the list of requirements for Quirky Update are towards the end.

    It may seem a bit complicated - but unfortunately it is! 🙂

  • I appreciate the help with my broken quirky update. I believed ive fixed that and edited my first post to reflect the new code.

    Is there anything else anyone can recommend that could help ?

  • You could try the following

    I'm sure that this can be improved though and I don't know if it will scale very well as it is recursive.

    IF OBJECT_ID('TempDB..#table','U') IS NOT NULL

    DROP TABLE #table

    CREATE TABLE #table

    (

    tableID int NOT NULL,

    runningCount int not null,

    orderCol int,

    FieldToSum float

    )

    IF OBJECT_ID('TempDB..#table2','U') IS NOT NULL

    DROP TABLE #table2

    CREATE TABLE #table2

    (

    tableID int NOT NULL,

    runningCount int not null,

    orderCol int,

    FieldToSum float

    )

    insert #table

    select1, 0, 1, 4.733415944 union all

    select2, 0, 8, 4.733415944 union all

    select3, 0, 6, 4.733415944 union all

    select4, 0, 19, 4.733415944 union all

    select5, 0, 14, 6.950871733 union all

    select6, 0, 3, 6.942781406 union all

    select7, 0, 2, 6.956403557 union all

    select8, 0, 20, 6.945977457 union all

    select9, 0, 11, 6.956403557 union all

    select10, 0, 13, 6.956403557 union all

    select11, 0, 4, 6.945977457 union all

    select12, 0, 10, 6.956403557 union all

    select13, 0, 17, 6.945977457 union all

    select14, 0, 12, 6.956403557 union all

    select15, 0, 15, 6.968313001 union all

    select16, 0, 5, 6.968313001 union all

    select17, 0, 7, 7.024537923 union all

    select18, 0, 18, 7.024537923 union all

    select19, 0, 16, 6.968313001 union all

    select20, 0, 9, 6.968313001

    CREATE UNIQUE INDEX [Idx1] ON #table(orderCol)

    ;WITH

    maxordercol as (SELECT MAX(orderCol) endorder FROM #table),

    a AS (

    SELECT TableID, orderCol, FieldToSum,

    FieldToSum RunTotal,

    1 runningCount,

    'A' direction

    FROM #table

    WHERE orderCol = 1

    UNION ALL

    SELECT t.TableID, t.orderCol, t.FieldToSum,

    case when t.FieldToSum + a.runTotal <= 25 then

    t.FieldToSum + a.runTotal

    else

    t.FieldToSum

    end RunTotal,

    case when t.FieldToSum + a.runTotal <= 25 then

    a.runningCount

    else

    a.runningCount + 1

    end runningCount,

    'A' direction

    FROM a

    inner join #table t on t.orderCol = a.orderCol + 1

    ),

    b as (

    SELECT TableID, orderCol, FieldToSum,

    FieldToSum RunTotal,

    1 runningCount,

    'D' direction

    FROM #table

    WHERE orderCol = (select endorder from maxordercol)

    UNION ALL

    SELECT t.TableID, t.orderCol, t.FieldToSum,

    case when t.FieldToSum + b.runTotal <= 25 then

    t.FieldToSum + b.runTotal

    else

    t.FieldToSum

    end RunTotal,

    case when t.FieldToSum + b.runTotal <= 25 then

    b.runningCount

    else

    b.runningCount + 1

    end runningCount,

    'D' direction

    FROM b

    inner join #table t on t.orderCol = b.orderCol - 1

    ),

    c AS (

    select tableid, runningCount, ordercol, fieldtosum

    from (

    select *, row_number() OVER (partition by tableid order by runningCount) seq

    from (select * from a union all select * from b) x

    ) y

    where y.seq = 1

    )

    INSERT INTO #table2

    SELECT *

    FROM c

    OPTION (MAXRECURSION 0)

    UPDATE #table

    SET runningCount = b.runningCount

    FROM #table a INNER JOIN #table2 b ON a.tableid = b.tableid

    I tried it over 1200 records as well and the performance was 40ms vs 4200ms

  • Thanks !!

    I will give it a go and let you know how it performs over our dataset.

  • I will try and explain a little better. Im trying to mark each row based on having a something_runtot up to or equal to 25 - it cant go over this number and i am required to use the marked rows elsewhere (this is where runningCount comes in).

    Tables are NOT magnetic tapes than can be run forward and backwards; what dose that mean?

    My terminology was a little off in places 🙂

    It means I need something_runtot up to or equal to 50 and the current business logic requires half from ascending and half from descending. So i need to get a running total up to or equal to 25, order by orderCol ascending then mark the records and i need to do the same for order by orderCol descending.

    So in my current code the first time through the loop it will give the value of 1 to all rows with a runningTotal up to or equal to 25 ordered by orderCol asc and a value of 1 to all rows with a runningTotal up to or equal to 25 ordered by orderCol desc. This needs to be repeated until no more rows are left and no rows are allowed to be used more than once.

    Do you have a true sequence number for the something_values you want to total?

    Yes i do but when i insert the values i use the identity column to hold the ordering. (Im also not too sure why IDENTITY shouldnt be used).

    Can you keep changing it?

    I can change it as much as necessary to get better performance, however id prefer not to change from a float currently as thats what the base table is using.

    I will give your code a go and see what happens - im in a totally different timezone so it will be a while before i can test it.

    As always if my explanations dont make sense please let me know and i will try to rewrite it.

    Thanks for the help.

  • matak (9/26/2012)


    I will try and explain a little better. Im trying to mark each row based on having a something_runtot up to or equal to 25 - it cant go over this number and i am required to use the marked rows elsewhere (this is where runningCount comes in).

    Tables are NOT magnetic tapes than can be run forward and backwards; what dose that mean?

    My terminology was a little off in places 🙂

    It means I need something_runtot up to or equal to 50 and the current business logic requires half from ascending and half from descending. So i need to get a running total up to or equal to 25, order by orderCol ascending then mark the records and i need to do the same for order by orderCol descending.

    So in my current code the first time through the loop it will give the value of 1 to all rows with a runningTotal up to or equal to 25 ordered by orderCol asc and a value of 1 to all rows with a runningTotal up to or equal to 25 ordered by orderCol desc. This needs to be repeated until no more rows are left and no rows are allowed to be used more than once.

    Do you have a true sequence number for the something_values you want to total?

    Yes i do but when i insert the values i use the identity column to hold the ordering. (Im also not too sure why IDENTITY shouldnt be used).

    Can you keep changing it?

    I can change it as much as necessary to get better performance, however id prefer not to change from a float currently as thats what the base table is using.

    I will give your code a go and see what happens - im in a totally different timezone so it will be a while before i can test it.

    As always if my explanations dont make sense please let me know and i will try to rewrite it.

    Thanks for the help.

    Be aware that the SUM.. OVER(ORDER BY ...) sub-clause requires SQL Server 2012

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply