Get running total in the following query

  • DECLARE @CumCredit TABLE

    (RowId INT, CustomerId INT, TransId INT, Pts INT)

    INSERT INTO @CumCredit (Rowid, CustomerID, TransId, pts)

    select 1,123,121,10

    union

    select 2,123,131,20

    union

    select 3,123,141,15

    select * from @CumCredit

    select a.*, (select sum(Pts) from @CumCredit where rowid<=a.rowid) as RunningTotal

    from @CumCredit a

  • beeramgopi (7/25/2012)


    DECLARE @CumCredit TABLE

    (RowId INT, CustomerId INT, TransId INT, Pts INT)

    INSERT INTO @CumCredit (Rowid, CustomerID, TransId, pts)

    select 1,123,121,10

    union

    select 2,123,131,20

    union

    select 3,123,141,15

    select * from @CumCredit

    select a.*, (select sum(Pts) from @CumCredit where rowid<=a.rowid) as RunningTotal

    from @CumCredit a

This triangular join works, but its performance is usually the lowest of the known T-SQL methods.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
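
For completeness: on SQL Server 2012 and later, a windowed SUM avoids the triangular join entirely and computes the running total in a single pass. A minimal sketch, assuming the same @CumCredit table defined above (verify performance on your own data):

```sql
-- Running total via windowed aggregate (SQL Server 2012+).
-- ROWS UNBOUNDED PRECEDING makes this a true running sum
-- rather than the default RANGE frame.
SELECT RowId, CustomerId, TransId, Pts,
       SUM(Pts) OVER (ORDER BY RowId
                      ROWS UNBOUNDED PRECEDING) AS RunningTotal
FROM @CumCredit;
```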

  • beeramgopi (7/25/2012)


    DECLARE @CumCredit TABLE

    (RowId INT, CustomerId INT, TransId INT, Pts INT)

    INSERT INTO @CumCredit (Rowid, CustomerID, TransId, pts)

    select 1,123,121,10

    union

    select 2,123,131,20

    union

    select 3,123,141,15

    select * from @CumCredit

    select a.*, (select sum(Pts) from @CumCredit where rowid<=a.rowid) as RunningTotal

    from @CumCredit a

    Try that on a big table 😉

Read this article on why triangular joins are bad.


    --edit--

    ChrisM@Work (7/25/2012)


This triangular join works, but its performance is usually the lowest of the known T-SQL methods.

    Ah, beat me to it. I've been on holiday, so I'm slowly having to learn to type again. That's my excuse for slow typing and I'm sticking to it 😀


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Gosh, that's quite confrontational.

    My sincere apologies if I came off as confrontational; that was not my intent.

    I, too, get quite a bit of information on the Internet from this site as well as stuff written by a couple of the names you mentioned.

The point I was attempting to make (and I concede that I did a poor job of it) is that a lot of people say a lot of things about SQL. Until it is tested and/or proven, it is, IMHO, just something someone said on the Internet. Perhaps I should have asked this instead: “Is this something you tested? If so, do you have any example code?” That was what I was looking for. I personally learn more when someone includes examples. In short, what I was looking for was your last post. Again, I am sorry if that came off as confrontational. I was not even disagreeing with you; I was more playing devil’s advocate (e.g. the comment about CLRs).

I have tested a few different methods for creating a running total. I am not married to any coding technique and am always looking for a faster way to get result sets. My finding has been that the rCTE is the fastest. That was true before this forum thread and has remained my finding since.

Below is the code I used to test the “quirky update” vs. an rCTE. This is not an apples-to-apples comparison between what the original poster asked for and what I am doing here; what I am about to show is why I believe an rCTE is the fastest means to do a running total. I originally began testing different techniques for loading a running-total table into a data warehouse a week or so ago. I was creating a new table, and the method we were going to use was an “INSERT INTO SalesRunningTotal SELECT * FROM RunningTotalQuery” technique. To determine which method was the fastest, I tested several techniques for producing the result set. Below is the query I used to produce the test data, followed by the three best-performing queries. My findings were (fastest to slowest): rCTE, QU, FAST_FORWARD cursor, plus numerous other techniques not included because they were slow.

    SET NOCOUNT ON;

    --First: Create Test Data

    BEGIN

    IF object_id('tempdb..##Sales') IS NOT NULL

    DROP TABLE ##Sales;

    CREATE TABLE ##Sales

    (

    [Day] int,

    Sales money,

    PRIMARY KEY([Day])

    );

    DECLARE @Day INT = 1, @Sales MONEY, @rand1 INT, @rand2 INT

    -- Put your row count here...

    WHILE @Day <= 100000

    BEGIN

SELECT @rand1 = ((RAND()*800))+500,

    @rand2 = ((RAND()*800)*-1)+200

    SET @Sales = @rand1+@rand2;

    INSERT ##Sales VALUES (@Day,@Sales);

    SET @Day = @Day + 1

    END;

    CREATE NONCLUSTERED INDEX idx_X

    ON ##Sales ([Day], Sales)

    END;

    GO

    /**************************************************

Cursor Solution (4 sec/100K rows; 44 sec/1M rows)

    **************************************************/

    BEGIN

    DECLARE @Day int, @Sales money

    DECLARE @RunningTotal money = 0

    DECLARE @SalesTbl_2 TABLE

    (

[Day] int,

Sales money,

RunningTotal money,

    PRIMARY KEY([Day])

    )

    DECLARE rt_cursor CURSOR FAST_FORWARD FOR

    SELECT [Day], Sales

    FROM ##Sales

    ORDER BY [Day]

    OPEN rt_cursor

    FETCH NEXT

    FROM rt_cursor

    INTO @Day,@Sales

    WHILE @@FETCH_STATUS = 0

    BEGIN

    SET @RunningTotal = @RunningTotal + @Sales

    INSERT @SalesTbl_2

    VALUES (@Day,@Sales,@RunningTotal)

    FETCH NEXT FROM rt_cursor INTO @Day,@Sales

    END

    CLOSE rt_cursor

    DEALLOCATE rt_cursor

    SELECT [Day],Sales,RunningTotal FROM @SalesTbl_2

    END

    GO

    /**************************************************

QU Method (2 sec/100K rows; 24 sec/1M rows)

    **************************************************/

    SET STATISTICS IO, TIME ON

    BEGIN

    IF OBJECT_ID('Tempdb..#sales_out') IS NOT NULL

    DROP TABLE #sales_out;

    CREATE TABLE #sales_out

    (

[Day#] int,

[Sales#] MONEY,

[RunningTotal] MONEY,

    PRIMARY KEY([Day#] ASC)

    );

    DECLARE @PrevDay INT, @RunningTotal MONEY = 0

    INSERT INTO #sales_out ([Day#],[Sales#])

    SELECT * FROM ##Sales;

    UPDATE #sales_out

SET @RunningTotal = RunningTotal = CASE

    WHEN [Day] = @PrevDay

    THEN @RunningTotal+[Sales#]

    ELSE Sales#

    END,

    @PrevDay = [Day]

    FROM ##Sales WITH (TABLOCKX)

    OPTION (MAXDOP 1)

    SELECT [Day#],[Sales#],[RunningTotal] FROM #sales_out

    END

    SET STATISTICS IO, TIME OFF

    GO

    /**************************************************

rCTE Solution (1 sec/100K rows; 17 sec/1M rows)

    **************************************************/

    SET STATISTICS IO, TIME ON

    DECLARE @sales_out TABLE

    (

[Day#] INT,

[Sales#] MONEY,

[RunningTotal] MONEY,

    PRIMARY KEY([Day#] ASC)

    );

    DECLARE @PrevDay INT, @RunningTotal MONEY = 0;

    WITH CTE ([Day], [Sales], [Running Total])

    AS

    (

SELECT [Day],

    [Sales],

    [Sales]

    FROM ##Sales

    WHERE [Day] = 1

    UNION ALL

SELECT a.[Day],

    a.[Sales],

    CTE.[Running Total] + a.[Sales]

    FROM CTE

    JOIN ##Sales a ON CTE.[Day] + 1 = a.[Day]

    )

    SELECT * FROM CTE

    OPTION (MAXRECURSION 0)

    GO

    Thoughts?

"I can't stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Lynn Pettis (7/23/2012)


Okay, now modify the rCTE to run against this dataset. It will contain 1,000,000 rows of data and potentially 10,000 unique cIDs, not one (cID = 123) as in the other test data set.

    CREATE TABLE #sampleData

    (

    RowID int,

    cID int,

    TransID int,

    pts int,

    PRIMARY KEY NONCLUSTERED([RowID])

    );

    declare @TransID int = 1;

    with

    e1(n) as (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),

    e2(n) as (select 1 from e1 a cross join e1 b),

    e4(n) as (select 1 from e2 a cross join e2 b),

    e6(n) as (select row_number() over (order by (select null)) from e4 a cross join e2 b)

    insert into #sampleData (RowID, cID, TransID, pts)

    select

    n,

    ceiling(rand(checksum(newid())) * 10000),

    @TransID + (5 * n),

    ceiling(rand(checksum(newid())) * 100)

    from

    e6;

    create clustered index IX_CustTran on #sampleData (

    cID,

    TransID);

    Very nice!
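
For reference, on SQL Server 2012+ this multi-customer case needs neither an rCTE nor a quirky update: a windowed SUM partitioned by customer produces the per-customer running total in one statement. A sketch, assuming the #sampleData table built above:

```sql
-- Per-customer running total (SQL Server 2012+).
-- PARTITION BY restarts the sum for each cID;
-- ROWS UNBOUNDED PRECEDING forces a running (not ranged) frame.
SELECT RowID, cID, TransID, pts,
       SUM(pts) OVER (PARTITION BY cID
                      ORDER BY TransID
                      ROWS UNBOUNDED PRECEDING) AS RunningTotal
FROM #sampleData
ORDER BY cID, TransID;
```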

"I can't stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

Thoughts? It all depends on where you run the code. On my development VM the Quirky Update beats your rCTE every time. Using your code, the QU runs in about 1100 ms and your rCTE in about 1600 ms.

  • Lynn Pettis (7/25/2012)


Thoughts? It all depends on where you run the code. On my development VM the Quirky Update beats your rCTE every time. Using your code, the QU runs in about 1100 ms and your rCTE in about 1600 ms.

That is very interesting - I need to test this some more. The "Quirky Update" is new to me; I don't recall using it in the past. I read up on it and tested it for the first time this week. I was impressed and am sure I will be using it in the future.

"I can't stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • XMLSQLNinja (7/25/2012)


    Gosh, that's quite confrontational.

    My sincere apologies if I came off as confrontational; that was not my intent. ...

    No offence taken Alan, and thanks for encouraging these tests to be performed.

There's an error or two in the code you posted for the QU method, causing it to run for longer than necessary and generating incorrect results. An amended version of your comparison script, including corrections to the QU method, is posted below. I've changed the points at which the timer starts and ends to exclude table creation, where relevant, and also restricted the read of the result set to a single row to reduce the time it takes to write the results out to the client.

    Having done this, and running against a million-row sample table, the QU comes in at around 2 seconds, the rCTE at about 19 seconds and the cursor at about 35 seconds. All three methods return the same result for day = 999991.

    SET NOCOUNT ON;

    --First: Create Test Data

    BEGIN

    IF object_id('tempdb..##Sales') IS NOT NULL

    DROP TABLE ##Sales;

    CREATE TABLE ##Sales

    (

    [Day] int,

    Sales money,

    PRIMARY KEY([Day])

    );

    DECLARE @Day INT = 1, @Sales MONEY, @rand1 INT, @rand2 INT

    -- Put your row count here...

    WHILE @Day <= 1000000

    BEGIN

SELECT @rand1 = ((RAND()*800))+500,

    @rand2 = ((RAND()*800)*-1)+200

    SET @Sales = @rand1+@rand2;

    INSERT ##Sales VALUES (@Day,@Sales);

    SET @Day = @Day + 1

    END;

    CREATE NONCLUSTERED INDEX idx_X

    ON ##Sales ([Day], Sales)

    END;

    GO

    /**************************************************

Cursor Solution (4 sec/100K rows; 44 sec/1M rows) /// 35 sec per 1M rows

    **************************************************/

    SET STATISTICS IO, TIME OFF

    BEGIN

    DECLARE @Day int, @Sales money

    DECLARE @RunningTotal money = 0

    DECLARE @SalesTbl_2 TABLE

    (

[Day] int,

Sales money,

RunningTotal money,

    PRIMARY KEY([Day])

    )

    DECLARE rt_cursor CURSOR FAST_FORWARD FOR

    SELECT [Day], Sales

    FROM ##Sales

    ORDER BY [Day]

    OPEN rt_cursor

    FETCH NEXT

    FROM rt_cursor

    INTO @Day,@Sales

    WHILE @@FETCH_STATUS = 0

    BEGIN

    SET @RunningTotal = @RunningTotal + @Sales

    INSERT @SalesTbl_2

    VALUES (@Day,@Sales,@RunningTotal)

    FETCH NEXT FROM rt_cursor INTO @Day,@Sales

    END

    CLOSE rt_cursor

    DEALLOCATE rt_cursor

    SELECT [Day],Sales,RunningTotal FROM @SalesTbl_2 WHERE [Day] = 999991

    END

    GO

    /**************************************************

QU Method (2 sec/100K rows; 24 sec/1M rows) /// about 2 sec per 1M rows

    **************************************************/

    BEGIN

    IF OBJECT_ID('Tempdb..#sales_out') IS NOT NULL

    DROP TABLE #sales_out;

    CREATE TABLE #sales_out

    (

[Day] int,

[Sales] MONEY,

[RunningTotal] MONEY

    );

    INSERT INTO #sales_out ([Day],[Sales])

    SELECT * FROM ##Sales ORDER BY [Day];

    CREATE UNIQUE CLUSTERED INDEX ucx_Sales ON #sales_out ([Day]);

    SET STATISTICS IO, TIME ON

    DECLARE @PrevDay INT = 0, @RunningTotal MONEY = 0

    UPDATE #sales_out

SET @RunningTotal = RunningTotal = @RunningTotal + [Sales],

    @PrevDay = [Day]

    FROM #sales_out WITH (TABLOCKX)

    OPTION (MAXDOP 1)

    SELECT [Day],[Sales],[RunningTotal] FROM #sales_out WHERE [Day] = 999991

    SET STATISTICS IO, TIME OFF

    END

    GO

    /**************************************************

rCTE Solution (1 sec/100K rows; 17 sec/1M rows) /// 19 sec per 1M rows

    **************************************************/

    SET STATISTICS IO, TIME ON

    --DECLARE @sales_out TABLE

    --(

    --[Day#]INT,

    --[Sales#]MONEY,

    --[RunningTotal]MONEY,

    --PRIMARY KEY([Day#] ASC)

    --);

    DECLARE @PrevDay INT, @RunningTotal MONEY = 0;

    WITH CTE ([Day], [Sales], [Running Total])

    AS

    (

SELECT [Day],

    [Sales],

    [Sales]

    FROM ##Sales

    WHERE [Day] = 1

    UNION ALL

SELECT a.[Day],

    a.[Sales],

    CTE.[Running Total] + a.[Sales]

    FROM CTE

    JOIN ##Sales a ON CTE.[Day] + 1 = a.[Day]

    )

    SELECT * FROM CTE WHERE [Day] = 999991

    OPTION (MAXRECURSION 0)

    SET STATISTICS IO, TIME Off

    GO


Low-hanging fruit picker and defender of the moggies

For better assistance in answering your questions, please read this article.


Understanding and using APPLY, (I) and (II) Paul White

Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • XMLSQLNinja (7/25/2012)


    Lynn Pettis (7/25/2012)


    Thoughts? It all depends on where you run the code. On my development VM the Quircky Update beats your rCTE every time. Using your code the QU runs in about 1100 ms and your rCTE in about 1600 ms.

    That is very interesting - I need to test this some more. The "Quirky Update" is new to me; I don't recall using it in the past. I read up on and tested for the first time this week. I was impressed and am sure I will be using it in the future.

It IS called the "QUIRKY" update for a reason. There are certain rules which need to be followed. Paul White and Tom Thompson came up with a built-in safety check for it, which is identified at the beginning of the article.
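
The commonly cited safety check works by carrying a row counter through the same quirky update and forcing a runtime error if any row is processed out of the expected order. A rough sketch of the idea only (variable names are illustrative; see the article for the authoritative version), assuming a #sales_out table with a sequential [Day] column:

```sql
DECLARE @PrevDay INT = 0, @RunningTotal MONEY = 0;

UPDATE s SET
    -- Safety check: if rows are not visited in [Day] order,
    -- the ELSE branch divides by zero and aborts the update.
    @PrevDay = CASE WHEN [Day] = @PrevDay + 1
                    THEN [Day]
                    ELSE 1/0 END,
    @RunningTotal = RunningTotal = @RunningTotal + [Sales]
FROM #sales_out s WITH (TABLOCKX)
OPTION (MAXDOP 1);
```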

In light of the new "previous row" capabilities of 2012, I may not update the article as promised at the beginning of the article.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 9 posts - 16 through 23 (of 23 total)
