Blog Post

SQL Grouping on Sums (with testing)

,

I ran across a post recently that I thought was an interesting T-SQL problem. The user wanted to group values into a running total, but the groups would reset based on a sum.

In this case, the user had this set of data:

2017-09-14 18_09_39-SQLQuery1.sql - (local)_SQL2016.sandbox (PLATO_Steve (72))_ - Microsoft SQL Serv

Their goal was to run through these values, in Category order, and whenever the running total sum of SomeValue exceeded 30, reset the sum. Their requirement was that this could only be a single value or two values, which boxes in the problem nicely. In other words, they wanted these results:

2017-09-14 18_12_02-SQLQuery1.sql - (local)_SQL2016.sandbox (PLATO_Steve (72))_ - Microsoft SQL Serv

The first two rows equal 30, so we reset for the third row. The third row is 30, so we reset for the fourth. The fourth and fifth would exceed 30, so each gets a reset. Five and six give 29, so we stop there.

I don’t know what the use case is here, but it’s an interesting problem.

My Solution

I had a quick solution using Lag. I created a quick query that looked back 1 and 2 rows. I could have stopped with one, but originally I thought that the poster might go to three rows if the 30 value wasn’t met. I use a CTE to get the current row and previous values, then  a simple CASE to sum values or return the current row.

WITH lagCTE
AS (SELECT
          Category,
          SomeValue,
          LagValue1 = LAG(SomeValue, 1, 0) OVER (ORDER BY Category),
          LagValue2 = LAG(SomeValue, 2, 0) OVER (ORDER BY Category)
     FROM Source
    )
SELECT
      lagCTE.Category,
      lagCTE.SomeValue,
      Sums = CASE
                 WHEN lagCTE.SomeValue + lagCTE.LagValue1 > 30 THEN
                     lagCTE.SomeValue
                 ELSE
                     lagCTE.SomeValue + lagCTE.LagValue1
             END
FROM lagCTE;

I also created a test, because, why do the math. Once I’ve done this, I want to ensure any code changes, any logic changes will still pass the same test. Here’s my test code:

EXEC tsqlt.NewTestClass @ClassName = N'tTSQLTests'
GO
CREATE PROCEDURE tTSQLTests.[test running total reset]
AS
-----------------------------------
-------   Assemble
-----------------------------------
EXEC tsqlt.FakeTable
     @TableName = N'RTSource'
INSERT RTSource
VALUES ('101', 10),
        ('102', 20),
        ('103', 30),
        ('104', 12),
        ('105', 19),
        ('106', 10),
        ('107', 10);
CREATE TABLE tTSQLTests.Expected
(   Category     VARCHAR(5),
     SomeValue    INT,
     RunningTotal INT
);
INSERT INTO tTSQLTests.Expected
VALUES
       ('101', 10, 10),
       ('102', 20, 30),
       ('103', 30, 30),
       ('104', 12, 12),
       ('105', 19, 19),
       ('106', 10, 29),
       ('107', 10, 10);
SELECT
       Category,
       SomeValue,
       RunningTotal
INTO  tTSQLTests.Actual
FROM  tTSQLTests.Expected
WHERE 1 = 0;
-----------------------------------
-------   Act
-----------------------------------
INSERT tTSQLTests.Actual EXEC RunningTotalQueries
-----------------------------------
-------   Assert
-----------------------------------
EXEC tsqlt.AssertEqualsTable
     @Expected = N'tTSQLTests.Expected',
     @Actual = N'tTSQLTests.Actual',
     @Message = N'incorrect query'
GO

Adding Counters

The poster then asked for a group counter, which becomes much harder. I was about to try for another CTE that would give me some counter I could work with when Jeff Moden used the quirky update to build a better script. You can read his code here.

Filed under: Blog Tagged: syndicated, testing, tsql, tsqlt

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating