Running total with reset condition

  • hi all,

    i would like a running total with a restart after a certain limit.  Let set the condition to be 30, so every 30 running sum will add to a group


    Cat Value Running_total 
    101 10 10
    102 20 30
    103 30 30
    104 12 12
    105 18 30
    106 10 10
    107 10 20 

    becomes


    Category Value Running_total GroupID
    101 10 10 1
    102 20 30 1
    103 30 30 2
    104 12 12 3
    105 18 19 3
    106 10 10 4
    107 10 20 4

    I am using the following for RunningTotal it restarts the counter but adds the remaining onto previous and doesnt restart with exact value (assuming 105 has value of 20)
    sum([Value]) over (order by [Category]  rows unbounded preceding ) % 30 as Running_Total 
    this results in:

    Category Value Running_total 
    101 10 10
    102 20 30
    103 30 30
    104 12 12
    105 20 2
    106 10 12
    107 10 27

    Many thanks

    DECLARE @T TABLE (Category VARCHAR(5), Value INT)
    INSERT INTO @T VALUES  
    ('101', 10), 
    ('102', 20), 
    ('103', 30), 
    ('104', 12), 
    ('105', 19), 
    ('106', 10), 
    ('107', 10)

  • How are you getting a running total when 104, 105, etc don't add to the total? Do you mean after a reset you stop?

  • Say i can only fit max of 30 values on 1 page.
    101, 102 = page 1
    103 = page 2
    104, 105 = page 3
    106, 107 = page 4

    the running total value resets when it reaches 30, then restarts at zero for next line and checks if running total is <=30 and assigns page.  so what i really want as output is


    Category Value Running_total Page
    101 10 10 1
    102 20 30 1 
    103 30 30 2
    104 12 12 3
    105 18 30 3
    106 10 1 4
    107 10 17 4

  • sorry, previous post 🙂

    Just found an "unfinished" post (Link)...this is exactly it...but cant seem to finish it off

  • What happens if the sum is not  exactly 30? Would you need to divide it? or should you move the whole row to a new page?

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • just move onto next row, which would reset and start a new running total.


    Category Value Running_total Page
    101 10 10 1
    102 20 30 1
    103 30 30 2
    104 12 12 3
    105 20 20 4
    106 5 25 4
    107 10 10 5

    above i change rows 105 onwards to explain your statement.
    at 104 the running total is 12, 105 value is 20, which goes over 30, so assign 104 with 3.
    at 105 the running total is 20, plus, 106 value of 5, is 25, 107 value of 10 goes over to 35.  so assign 105 and 106 with groupid 4
    and so on

  • K, rolled into a proc since it's easier to test. Put your data into a table called RTSource. This lets me add other data and play.

    Proc:
    CREATE OR ALTER PROCEDURE RunningTotalQueries
    AS
    WITH lagCTE
    AS (SELECT
       Category,
       SomeValue,
       LagValue1 = LAG(SomeValue, 1, 0) OVER (ORDER BY Category),
       LagValue2 = LAG(SomeValue, 2, 0) OVER (ORDER BY Category)
      FROM RTSource
     )
    SELECT
      lagCTE.Category,
      lagCTE.SomeValue,
      SUms = CASE
          WHEN lagCTE.SomeValue + lagCTE.LagValue1 > 30 THEN
           lagCTE.SomeValue
          ELSE
           lagCTE.SomeValue + lagCTE.LagValue1
        END
    FROM lagCTE;
    GO

    Test. Uses tSQLt. If I wanted to check other data, I'd either add it into the test or I'd create duplicate tests.


    EXEC tSQLt.NewTestClass @ClassName = N'tTSQLTests';
    GO

    CREATE PROCEDURE tTSQLTests.[test running total reset]
    AS
    -----------------------------------
    ------- Assemble
    -----------------------------------
    EXEC tsqlt.FakeTable
      @TableName = N'RTSource'

    INSERT RTSource
    VALUES ('101', 10),
       ('102', 20),
       ('103', 30),
       ('104', 12),
       ('105', 19),
       ('106', 10),
       ('107', 10);

    CREATE TABLE tTSQLTests.Expected
    ( Category  VARCHAR(5),
      SomeValue  INT,
      RunningTotal INT
    );
    INSERT INTO tTSQLTests.Expected
    VALUES
      ('101', 10, 10),
      ('102', 20, 30),
      ('103', 30, 30),
      ('104', 12, 12),
      ('105', 19, 19),
      ('106', 10, 29),
      ('107', 10, 10);
    SELECT
      Category,
      SomeValue,
      RunningTotal
    INTO tTSQLTests.Actual
    FROM tTSQLTests.Expected
    WHERE 1 = 0;

    -----------------------------------
    ------- Act
    -----------------------------------
    INSERT tTSQLTests.Actual EXEC RunningTotalQueries

    -----------------------------------
    ------- Assert
    -----------------------------------
    EXEC tsqlt.AssertEqualsTable
      @Expected = N'tTSQLTests.Expected',
      @Actual = N'tTSQLTests.Actual',
      @Message = N'incorrect query'
    GO

  • Talvin Singh - Monday, September 11, 2017 11:47 AM

    just move onto next row, which would reset and start a new running total.


    Category Value Running_total Page
    101 10 10 1
    102 20 30 1
    103 30 30 2
    104 12 12 3
    105 20 20 4
    106 5 25 4
    107 10 10 5

    above i change rows 105 onwards to explain your statement.
    at 104 the running total is 12, 105 value is 20, which goes over 30, so assign 104 with 3.
    at 105 the running total is 20, plus, 106 value of 5, is 25, 107 value of 10 goes over to 35.  so assign 105 and 106 with groupid 4
    and so on

    One final question.  If you're limit is 30 and you have a row that contains something larger on that one row (Value = 92, for example), what then?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I will give this a go!
    The large dataset I have, a single row is not higher than the condition. I will be testing this on a larger dataset that has a over 50k rows, a max value of 30, but a running total max 500.
    Do you think this will work with the above code?

  • With the proc procedure, is there a way to create a column that assigns a group id?
    For example, for those values that reach the limit of 30,you assign an id, so in our example the first 2 rows will be group id 1, see below;

    Category Value Running_total groupId 
    101 10 10 1
    102 20 30 1
    103 30 30 2
    104 12 12 3
    105 20 20 4
    106 5 25 4
    107 10 10 5

  • Are there other values in the set? You don't really have anything that creates a group, other than the sum, and for that, you have a variable set of values.

  • Here's a version that works in all versions of SQL Server from 2005 and up.  And, yes... it calculates the GroupID in a fashion that Steve identified above.

    First, the test data... read the comment... it's critical.


    --===== The table MUST have a UNIQUE CLUSTERED INDEX
         -- on the Category column AND the Category column
         -- MUST preserve the order that you want the data
         -- to appear in (might be tough with VARCHAR()).
     CREATE TABLE #TestTable
            (
             Category       VARCHAR(5)  NOT NULL PRIMARY KEY CLUSTERED
            ,Value          INT         NOT NULL
            ,RunningTotal   INT
            ,GroupID        INT
            )
    ;
     INSERT INTO #TestTable
            (Category, Value)
     SELECT '101', 10 UNION ALL
     SELECT '102', 20 UNION ALL
     SELECT '103', 30 UNION ALL
     SELECT '104', 12 UNION ALL
     SELECT '105', 20 UNION ALL
     SELECT '106', 5  UNION ALL
     SELECT '107', 10
    ;

    After that, we do what is affectionately known as the "Quirky Update".  It works in all versions of SQL from 2005 and up.  It's also nasty fast and will process a million rows in just a couple of seconds.  Do read the comments.  They're important.  So is the order of processing.  The GroupID MUST be calculated before the RunningTotal changes and the "Safety" must calculated last (or at least after the CASE that does the safety check).


    --===== Declare and preset the support variables.
         -- Did it this way so that it works for all versions 2005 and up.
         -- The variables have obvious names as to what they're for.
    DECLARE  @RunningTotal  INT
            ,@GroupID       INT
            ,@Safety        INT
            ,@MaxValue      INT
    ;
     SELECT  @RunningTotal  = 0
            ,@GroupID       = 1
            ,@Safety        = 1
            ,@MaxValue      = 30
    ;
    --===== This "Quirky Update" works very similar to how you
         -- would solve the problem with a loop except it uses
         -- the behind the scenes natural loops (pseudo-cursor)
         -- of both the SELECT and the UPDATE.
       WITH cteQU AS
    (--==== This cte adds a row number as a "safety counter"
         -- that we'll use to ensure that the proper order
         -- of processing occurred.  It also forces the order
         -- so that the safety check is also the order guarantee.
     SELECT  Safety  = ROW_NUMBER() OVER (ORDER BY Category)
            ,Value
            ,RunningTotal
            ,GroupID
       FROM #TestTable
    ) --=== This is just like any managed code except it uses 3-part updates
     UPDATE qu
        SET  @GroupID      = GroupID      = CASE
                                            WHEN @RunningTotal + Value <= @MaxValue
                                            THEN @GroupID
                                            ELSE @GroupID +1       
                                            END
            ,@RunningTotal = RunningTotal = CASE
                                            WHEN Safety = @Safety
                                            THEN
                                                CASE
                                                WHEN @RunningTotal + Value <= @MaxValue
                                                THEN @RunningTotal + Value
                                                ELSE Value       
                                                END
                                            ELSE 'Process Order Failed.' +@Safety/0
                                            END
            ,@Safety       = @Safety + 1
       FROM cteQU qu WITH (TABLOCKX, INDEX(1)) --Must have these hints
     OPTION (MAXDOP 1)                         --Must guarantee no parallelism
    ;
     SELECT * FROM #TestTable
    ;

    The old "Black Arts" of T-SQL rule. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • LOL, nice, Jeff. I was hoping to avoid some variable in there, but that looks nice.

  • Steve Jones - SSC Editor - Tuesday, September 12, 2017 8:05 AM

    LOL, nice, Jeff. I was hoping to avoid some variable in there, but that looks nice.

    🙂😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This works very well!
    i was able to test it on a large dataset, works blindingly fast!
    really appreciate this

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply