Cumulative SUM by HOUR for specific DATE range

  • Hello,

    I have searched a number of forums and tried a few examples but cannot get anything to work. I hope i include enough information to be helpful.

    Please can someone advise how i can accomplish the following:

    ==========

    existing table with format

    RUNID; TIMEFRAME_START; QUEUE_NAME; ATSTART_MAILS; REPLIED_MAILS

    1; 2012-02-10 00:00:00.000; QUEUE1; 25; 15

    1; 2012-02-10 01:00:00.000; QUEUE1; 23; 14

    1; 2012-02-10 02:00:00.000; QUEUE1; 21; 11

    >>>>>

    1; 2012-02-10 22:00:00.000; QUEUE1; 14; 9

    1; 2012-02-10 23:00:00.000; QUEUE1; 19; 17

    2; 2012-02-11 00:00:00.000; QUEUE1; 22; 13

    2; 2012-02-11 01:00:00.000; QUEUE1; 21; 11

    etc, etc, etc

    i need to run some SQL daily to output results for the last day in the following format:

    TIMEFRAME_START; QUEUE_NAME; ATSTART_MAILS; REPLIED_MAILS

    2012-02-10 00:00:00.000; QUEUE1; 25; 15

    2012-02-10 01:00:00.000; QUEUE1; 48; 29

    2012-02-10 02:00:00.000; QUEUE1; 69; 40

    etc, etc, etc

    i have tried using CTE with a <= inner join but keep getting huge results (i narrowed it down to 1 queue for easy analysis of results....and will need to add more columns once the base code is sorted)

    with data as

    (

    select TIMEFRAME_START, QUEUE_NAME, ATSTART_MAILS, REPLIED_MAILS from EGTS_OUT_EMAIL_ACTIVITY_HOURLY

    WHERE RUNID=(SELECT MAX(RUNID) FROM EGTS_OUT_EMAIL_ACTIVITY_HOURLY)

    )

    SELECT

    t.TIMEFRAME_START, t.queue_name,

    (SELECT SUM(atstart_mails) FROM data WHERE data.TIMEFRAME_START<=t.TIMEFRAME_START),

    (SELECT sum(replied_mails) FROM data WHERE data.TIMEFRAME_START<=t.TIMEFRAME_START)

    FROM EGTS_OUT_EMAIL_ACTIVITY_HOURLY t

    where RUNID=(SELECT MAX(RUNID) FROM EGTS_OUT_EMAIL_ACTIVITY_HOURLY)

    and t.QUEUE_NAME='5 UK GOODS'

    group by t.timeframe_start, t.QUEUE_NAME

    order by TIMEFRAME_START asc

    i know the subqueries are bad :p

    i think its something to do with the <= join not using the RUNID clause somehow but not sure as i added it on both sides....

    many thanks

    samuel

  • Could you please provide DDL and insert sample data script? Tips how to do so are in the link at the bottom of my signature.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • ignore me....the code works!!

    im an idiot and forgot to reference properly!!

    thanks anyway :p

    with data as

    (

    select TIMEFRAME_START, QUEUE_NAME, ATSTART_MAILS, REPLIED_MAILS from EGTS_OUT_EMAIL_ACTIVITY_HOURLY

    WHERE RUNID=(SELECT MAX(RUNID) FROM EGTS_OUT_EMAIL_ACTIVITY_HOURLY)

    and QUEUE_NAME='5 UK GOODS'

    )

    SELECT

    t.TIMEFRAME_START, t.queue_name,

    (SELECT SUM(atstart_mails) FROM data WHERE data.TIMEFRAME_START<=t.TIMEFRAME_START),

    (SELECT sum(replied_mails) FROM data WHERE data.TIMEFRAME_START<=t.TIMEFRAME_START)

    FROM EGTS_OUT_EMAIL_ACTIVITY_HOURLY t

    where t.RUNID=(SELECT MAX(RUNID) FROM EGTS_OUT_EMAIL_ACTIVITY_HOURLY)

    and t.QUEUE_NAME='5 UK GOODS'

    group by t.timeframe_start, t.QUEUE_NAME

    order by t.TIMEFRAME_START asc

  • You might wnat to revisit the code, I would suggest reading this article http://www.sqlservercentral.com/articles/T-SQL/61539/ and then look at a turning this into a Quirky update for the Running total, something like this http://www.sqlservercentral.com/articles/T-SQL/68467/

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • Jason-299789 (2/11/2013)


    You might wnat to revisit the code, I would suggest reading this article http://www.sqlservercentral.com/articles/T-SQL/61539/ and then look at a turning this into a Quirky update for the Running total, something like this http://www.sqlservercentral.com/articles/T-SQL/68467/

    The OP's query uses triangular joins, usually a no-no because of the potential for huge working sets. However, if partitions are small, it can perform reasonably well. The easiest way to calculate running totals is with a recursive CTE as follows:

    ;WITH SampleData (RUNID, TIMEFRAME_START, QUEUE_NAME, ATSTART_MAILS, REPLIED_MAILS)AS (

    SELECT 1, '2012-02-10 00:00:00.000', 'QUEUE1', 25, 15 UNION ALL

    SELECT 1, '2012-02-10 01:00:00.000', 'QUEUE1', 23, 14 UNION ALL

    SELECT 1, '2012-02-10 02:00:00.000', 'QUEUE1', 21, 11 UNION ALL

    SELECT 1, '2012-02-10 22:00:00.000', 'QUEUE1', 14, 9 UNION ALL

    SELECT 1, '2012-02-10 23:00:00.000', 'QUEUE1', 19, 17 UNION ALL

    SELECT 2, '2012-02-11 00:00:00.000', 'QUEUE1', 22, 13 UNION ALL

    SELECT 2, '2012-02-11 01:00:00.000', 'QUEUE1', 21, 11 )

    , SequencedData AS (

    SELECT

    Seq = ROW_NUMBER() OVER(ORDER BY RUNID, TIMEFRAME_START),

    *

    FROM SampleData

    ),

    Calculator AS (

    SELECT Seq, RUNID, TIMEFRAME_START, QUEUE_NAME, ATSTART_MAILS, REPLIED_MAILS

    FROM SequencedData

    WHERE Seq = 1

    UNION ALL

    SELECT nr.Seq, nr.RUNID, nr.TIMEFRAME_START, nr.QUEUE_NAME,

    ATSTART_MAILS = CASE WHEN nr.RUNID = lr.RUNID THEN nr.ATSTART_MAILS+lr.ATSTART_MAILS ELSE nr.ATSTART_MAILS END,

    REPLIED_MAILS = CASE WHEN nr.RUNID = lr.RUNID THEN nr.REPLIED_MAILS+lr.REPLIED_MAILS ELSE nr.REPLIED_MAILS END

    FROM Calculator lr -- last row

    INNER JOIN SequencedData nr -- new row

    ON nr.Seq = lr.Seq+1

    )

    SELECT *

    FROM Calculator

    The fastest way to calculate running totals is the QU method referenced by Jason.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ive got over 500 queues and 15 extra data columns to compute also....would those links scale up??

    ive got this running in under 20 seconds....ill look into those other methods

    many thanks

  • Chris,

    I've never seen the recursive CTE method before, and looks like a viable option to the quirky update to a point.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • lilywhites (2/11/2013)


    ive got over 500 queues and 15 extra data columns to compute also....would those links scale up??

    ive got this running in under 20 seconds....ill look into those other methods

    many thanks

    The QU method scales well and will take about 5 seconds to UPDATE a million rows (on what? Most folks' testing rig). The rCTE method also scales well and takes about 25 seconds to SELECT a million rows (plus network time). There are a number of other methods, all of which are slower. The triangular join method you are using usually scales particularly badly, because as the set size increases, so does the number of rows to be aggregated. In your case the data is partitioned which helps.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Jason-299789 (2/11/2013)


    Chris,

    I've never seen the recursive CTE method before, and looks like a viable option to the quirky update to a point.

    Hi Jason, it's been subjected to a significant amount of intermittent testing here on ssc, have a Goooogle.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ill check out the rCTE method for the final phase as there may well be expansion in the data set in the future....as a proof of concept this will have to do for now 🙁

    thanks for your help

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply