grouped running totals

  • Hi all. Firstly, here's some background and then some test data. I have a table that has lots of staff (described by unique staffids) and hours worked as a metric for each int week number.

    I'm looking for an update statement that'll wizz thru the table and insert running totals for each staff member (in weekno order asc). I can do this for 1 staff member, but am struggling where there are lots of the little devils in there.

    Test data, results and desired resuts follow:

    declare @work table(

    runtot float, staffid int, weekno int, variancehours float )

    --- little table def

    insert @work (runtot, staffid, weekno, variancehours)

    --- insert some test data

    select null,1,1,1.5 union

    select null,1,2,-2 union

    select null,1,3,-1 union

    select null,1,4,5 union

    select null,2,1,4 union

    select null,2,2,-1

    ---

    select * from @work

    --- results would be:

    NULL111.5

    NULL12-2

    NULL13-1

    NULL145

    NULL214

    NULL22-1

    --results I'd like to see post update...

    1.5111.5

    -0.512-2

    -1.513-1

    -2.5145

    4214

    322-1

    Thanks for any help, regards Greg.

  • Check out this article http://www.sqlservercentral.com/articles/T-SQL/68467/ AND the discussion that goes with it.

  • Thanks. I;d read this already but it didn't help - I'd implemented the triangular join type mechanism (I think) to create a running total - but this part I'm stuck on specifically is that I now need to UPDATE what's there, considering seperate running totals for differing staffids. I know i am being dumb, as it'll just need some grouping on staffid, I guess...

  • CELKO (4/20/2012)


    >> I have a table that has lots of staff (described by unique staff_ids) and hours worked as a metric for each int week number. <<

    You have not ever worked with database before and never did any research. I would give you 1-2 years before you have an epiphany and can write good SQL.

    1. Nobody would use integers for staff. There is no validation or verification. How about the SSN?

    2. There is an ISO-8601 for weeks within a year. The format is yyyyWww. We use standards. Bad programmers invent and re-invent.

    3. Why are using FLOAT? Are those hours really keep to laboratory precision? Have you ever seen a time clock that has more than two decimal places? You office has an atomic clock?

    4. Where is the key to table? No constraints, etc. Files and punch cards are like this! This is not SQL yet.

    5. How do you work negative hours, as shown in your sample data?

    6. Why are you storing a running total? SQL people will compute such things and perhaps put it in a VIEW. You are a punch card programmer who has to PHYSICALLY store computations in the card.

    Here is how we would do this in SQL

    CREATE TABLE Timecards

    (staff_id CHAR(9) NOT NULL PRIMARY KEY

    REFERENCES Personnel (staff_id)

    work_week CHAR(10) NOT NULL

    CHECK (work_week LIKE '[12][0-9][0-9][0-9]W[0-5][0-9]'),

    PRIMARY KEY (staff_id, work_week),

    work_hrs DECIMAL (5,2) DEFAULT 0.00 NOT NULL);

    Now we use the window clause on a SUM() aggregate function.

    SELECT staff_id, work_week,

    SUM(work_hrs)

    OVER (PARTITION BY staff_id

    ORDER BY work_week

    ROWS UNBOUNDED PRECEDING BETWEEN CURRENT ROW)

    AS work_hrs_run_tot

    FROM Timecards

    WHERE ..;

    Well Mr. Celko, your solution would work if this were SQL Server 2012 (if I remember coorectly, this is the version where these get expanded), unfortunately your solution will not work in SQL Server 2008R2 and older. How about providing a working solution to a problem.

  • CREATE TABLE Timecards

    (staff_id CHAR(9) NOT NULL PRIMARY KEY

    REFERENCES Personnel (staff_id)

    work_week CHAR(10) NOT NULL

    CHECK (work_week LIKE '[12][0-9][0-9][0-9]W[0-5][0-9]'),

    PRIMARY KEY (staff_id, work_week),

    work_hrs DECIMAL (5,2) DEFAULT 0.00 NOT NULL);

    Aren't there multiple primary keys in this definition or am I missing something?

  • CELKO (4/20/2012)


    >> I have a table that has lots of staff (described by unique staff_ids) and hours worked as a metric for each int week number. <<

    You have not ever worked with database before and never did any research. I would give you 1-2 years before you have an epiphany and can write good SQL.

    1. Nobody would use integers for staff. There is no validation or verification. How about the SSN?

    2. There is an ISO-8601 for weeks within a year. The format is yyyyWww. We use standards. Bad programmers invent and re-invent.

    3. Why are using FLOAT? Are those hours really keep to laboratory precision? Have you ever seen a time clock that has more than two decimal places? You office has an atomic clock?

    4. Where is the key to table? No constraints, etc. Files and punch cards are like this! This is not SQL yet.

    5. How do you work negative hours, as shown in your sample data?

    6. Why are you storing a running total? SQL people will compute such things and perhaps put it in a VIEW. You are a punch card programmer who has to PHYSICALLY store computations in the card.

    Here is how we would do this in SQL

    CREATE TABLE Timecards

    (staff_id CHAR(9) NOT NULL PRIMARY KEY

    REFERENCES Personnel (staff_id)

    work_week CHAR(10) NOT NULL

    CHECK (work_week LIKE '[12][0-9][0-9][0-9]W[0-5][0-9]'),

    PRIMARY KEY (staff_id, work_week),

    work_hrs DECIMAL (5,2) DEFAULT 0.00 NOT NULL);

    Now we use the window clause on a SUM() aggregate function.

    SELECT staff_id, work_week,

    SUM(work_hrs)

    OVER (PARTITION BY staff_id

    ORDER BY work_week

    ROWS UNBOUNDED PRECEDING BETWEEN CURRENT ROW)

    AS work_hrs_run_tot

    FROM Timecards

    WHERE ..;

    Let's see your SQL Server 2008 solution, Joe. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • greg.bull (4/20/2012)


    Thanks. I;d read this already but it didn't help - I'd implemented the triangular join type mechanism (I think) to create a running total - but this part I'm stuck on specifically is that I now need to UPDATE what's there, considering seperate running totals for differing staffids. I know i am being dumb, as it'll just need some grouping on staffid, I guess...

    Actually, the whole purpose of the article was to get people away from RBAR methods and the Triangular Join is a form of "Hidden" RBAR. The article was actually about the Quirky Update method to replace Triangular Joins, etc.

    Here's your solution. As normal, the details are in the comments in the code.

    --===== Conditionally drop the work table to

    -- make reruns in SSMS easier.

    IF OBJECT_ID('Tempdb..#Work','U') IS NOT NULL

    DROP TABLE #Work

    ;

    --===== Create the working table

    CREATE TABLE #Work

    (

    RunTot FLOAT,

    StaffID INT,

    WeekNo INT,

    VarianceHours FLOAT

    )

    ;

    --===== Populate the working table.

    INSERT INTO #Work

    (RunTot,StaffID,WeekNo,VarianceHours)

    SELECT NULL,1,1,1.5 UNION ALL

    SELECT NULL,1,2,-2 UNION ALL

    SELECT NULL,1,3,-1 UNION ALL

    SELECT NULL,1,4,5 UNION ALL

    SELECT NULL,2,1,4 UNION ALL

    SELECT NULL,2,2,-1

    ;

    --===== Add the quintessential Clustered Index to

    -- the columns the Quirky Update will rely on.

    CREATE UNIQUE CLUSTERED INDEX IX_#Work_QU

    ON #Work (StaffID, WeekNo)

    ;

    --===== Declare some obviously name variables for

    -- the Quirky Update to use.

    DECLARE @RunTot FLOAT,

    @PrevStaffID INT,

    @Counter INT

    ;

    SELECT @Counter = 1

    ;

    --===== Do the Running Total using the Quirky Update with a

    -- built in safety mechanism to force an error if the

    -- order is disturbed.

    -- This will do a million rows in just a couple of seconds

    -- which blows the 2012 method out of the water. Still,

    -- the 2012 method is faster than a cursor and it's supported.

    WITH

    cteEnumerate AS

    (

    SELECT Counter = ROW_NUMBER() OVER (ORDER BY StaffID, WeekNo),

    RunTot,StaffID,WeekNo,VarianceHours

    FROM #Work WITH(TABLOCKX)

    )

    UPDATE cteEnumerate

    SET @RunTot = RunTot

    = CASE

    WHEN Counter = @Counter

    THEN

    CASE

    WHEN StaffID = @PrevStaffID

    THEN @RunTot + VarianceHours

    ELSE VarianceHours

    END

    ELSE 1/0 --Force Error

    END,

    @PrevStaffID = StaffID,

    @Counter = @Counter + 1

    FROM cteEnumerate

    OPTION(MAXDOP 1)

    ;

    --===== Show the results.

    SELECT * FROM #Work ORDER BY StaffID, WeekNo

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks to you all for your comments and feedback. Jeff, I understood the purpose of your excellent article - but I'd probably have to use a very simple syntax (that I could understand and support). Will try to fathom out how your solution works and give it a try.

    Regards, Greg.

  • Thanks Lynn, for your direction and advice too ! I'm very new to SQL and the basic syntax is about all I can get my head around at present. As for the table I'm working on, that's just the existing structure which I need to analyse...

  • greg.bull (4/22/2012)


    Thanks to you all for your comments and feedback. Jeff, I understood the purpose of your excellent article - but I'd probably have to use a very simple syntax (that I could understand and support). Will try to fathom out how your solution works and give it a try.

    Regards, Greg.

    Thank you for the feedback.

    The way it works is just exactly the same as if you were to write some C# code to do it all except you that you don't have to write the Read/Write code... UPDATE does that for you.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply