Getting the difference from current row and previous row

  • Hi,

    I have a database table which logs hourly value from a meter. I need to get the difference between the current row value and the previous row value to get the actual hourly value.

    for example

    timestamp value

    2012-01-16 00:00:00 2345

    2012-01-16 01:00:00 2450

    2012-01-16 02:00:00 2540

    Need to get

    hour value

    2012-01-16 00:00:00 105

    2012-01-16 01:00:00 90

    Currently I am using a SQL statement as below

    SELECT DATEADD(hour,-1,temp1.Timestamp) AS hour,(temp1.Data-temp2.Data) AS hourData

    FROM temp AS temp1 JOIN temp AS temp2

    ON DATEADD(hour,-1,temp1.Timestamp)= temp2.Timestamp

    However temp table has about 12,000 rows.

    There for executing the query it take about 90 seconds.

    How can I do my query in less time.

    Thanks in advanced.

  • Your issue is probably due to your index.

    Try this replication script: -

    BEGIN TRAN

    SET NOCOUNT ON

    IF object_id('tempdb..#testEnvironment') IS NOT NULL

    BEGIN

    DROP TABLE #testEnvironment

    END

    --1,000,000 Random rows of data

    ;WITH CTE AS (

    SELECT DATEADD(HOUR,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),'2000-01-01') AS [timestamp]

    FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3)

    SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID, [timestamp],

    (ABS(CHECKSUM(NEWID())) % 200) + 1 AS data

    INTO #testEnvironment

    FROM CTE

    PRINT '========== BASELINE =========='

    SET STATISTICS TIME ON

    SELECT COUNT(*) FROM #testEnvironment

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    PRINT '========== YOUR QUERY =========='

    SET STATISTICS TIME ON

    SELECT DATEADD(hour,-1,temp1.[timestamp]) AS [hour],(temp1.data-temp2.data) AS hourData

    FROM #testEnvironment temp1

    JOIN #testEnvironment temp2 ON DATEADD(hour,-1,temp1.[timestamp])= temp2.[timestamp]

    SET STATISTICS TIME OFF

    PRINT REPLICATE('=',80)

    ROLLBACK

    So that's using your query on 1 million rows, not just 12,000.

    Here are the times on my box

    ========== BASELINE ==========

    SQL Server Execution Times:

    CPU time = 94 ms, elapsed time = 94 ms.

    ================================================================================

    ========== YOUR QUERY ==========

    SQL Server Execution Times:

    CPU time = 2545 ms, elapsed time = 4717 ms.

    ================================================================================

    So nearly 5 seconds on 1 million rows of data.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • hemaljoes (1/16/2012)


    However temp table has about 12,000 rows.

    There for executing the query it take about 90 seconds.

    How can I do my query in less time.

    Are you able to provide us with an execution plan? Run the query with Actual Execution Plan on in SSMS, right click the graphical plan after execution finishes, save to a file, and attach it to your next post. The problem should be quite clear from that. If you are able to script the table definition and current indexes, that would be helpful too.

  • How does this work for you?

    DECLARE @TestTable TABLE (LogTime DATETIME PRIMARY KEY, Qty INTEGER);

    INSERT INTO @TestTable (LogTime, Qty)

    SELECT '2012-01-16 00:00:00', 2345 UNION ALL

    SELECT '2012-01-16 01:00:00', 2450 UNION ALL

    SELECT '2012-01-16 02:00:00', 2540;

    WITH cte AS

    (

    SELECT LogTime, Qty,

    RN = ROW_NUMBER()

    OVER (ORDER BY LogTime)

    FROM @TestTable

    )

    SELECT curr.LogTime,

    HourValue = curr.Qty - prev.Qty

    FROM cte curr

    JOIN cte prev

    ON curr.RN = prev.RN +1;

    You know, the people that help out here are all un-paid volunteers, so please HELP US HELP YOU. Providing the DDL scripts (CREATE TABLE, CREATE INDEX, etc.) for the tables affected, and INSERT statements to put some test data into those tables that shows your problem will go a long way in getting people to look at your issue and help you out. Please include code for what you have already tried. Don't forget to include what your expected results should be, based on the sample data provided. As a bonus to you, you will get tested code back. For more details on how to get all of this into your post, please look at the first link in my signature.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • hemaljoes (1/16/2012)


    However temp table has about 12,000 rows.

    There for executing the query it take about 90 seconds.

    How can I do my query in less time.

    I believe you may have a much larger problem with your system. First, how long does the following query take?

    SELECT [TimeStamp],Data

    FROM Temp;

    The key here is that you're selecting from all rows in the table meaning that even if you were to somehow get a SEEK out of a query, it's still going to do a scan behind the scenes after if finds the "first row". That's the reason why I asked the code question. That should be your bench mark for the minimum amount of time the code should take with or without even the best indexing practices.

    The reason why I say you may have a serious large problem is because it only takes 42 seconds for the code above to run on my nearly decade old, single CPU, desktop box. Newer machines will do much better. 90 seconds is way to long. Based on the age and relatively low performance of my machine and what I saw your code do on Cadavre's machine, I'd say even 42 seconds is way too long. Since Cadavre showed the whole thing ran in about 5 seconds on a million rows and it's taking 90 seconds for the same code to run on your machine, I'm thinking that an extreme IO bottleneck or a serious memory problem is your real enemy here. We had a machine with a bad memory chip in it exhibit similar outlandish performance problems. As soon as we found and replace the bad memory chip, code that was previously taking an hour to run on it suddenly took only seconds.

    It's also possible for such a slowdown to occur if you have an active virus scanner on the server or some real bad network problems. At this point, it could be anything but its not the code you posted and it's not the problem. Some else is seriously wrong with the machine or the software on it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks everyone for the replies.

    I'm new to programming. Need bit more time to understand what some of you have told. I will try those stuff and reply ASAP.

    Thanks.

  • hemaljoes (1/16/2012)


    Thanks everyone for the replies.

    I'm new to programming. Need bit more time to understand what some of you have told. I will try those stuff and reply ASAP.

    Thanks.

    The first thing you should try is running the straight query I posted along with the time it took to run on the 12k rows. Like I said, something is really wrong if it doesn't return instantly. Hmmm... is what you're calling a "temp" table really a VIEW, perhaps?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi,

    I was able to cut down execution time to 20sec by replacing the

    DATEADD(hour,-1,temp1.Timestamp)= temp2.Timestamp

    to

    ON temp1.RecordNumber-1= temp2.RecordNumber

    for the JOINING condition

    The Original SELECT statement was

    SELECT DATE_ADD(temp1.Timestamp, INTERVAL -1 HOUR) AS hour,(temp1.Data-temp2.Data) AS hourData

    FROM temp AS temp1 JOIN temp AS temp2

    ON temp1.RecordNumber-1= temp2.RecordNumber

    Here the RecordNumber is a sequential number for a particula TLInstance. And RecordNumber and TLInstance makes the primary key for the tldata table.

    I am eager to know the cause of the time reduction.

    @jeff Moden

    The SELECT [TimeStamp],Data FROM Temp query execute in about 0.3 sec

    Thanks for the Help provided.

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply