Getting the difference from current row and previous row

Question

Post reply

Getting the difference from current row and previous row

hemaljoes

Old Hand

Points: 355
More actions
January 16, 2012 at 4:49 am

#249086

Hi,
I have a database table which logs hourly value from a meter. I need to get the difference between the current row value and the previous row value to get the actual hourly value.
for example
timestamp value
2012-01-16 00:00:00 2345
2012-01-16 01:00:00 2450
2012-01-16 02:00:00 2540
Need to get
hour value
2012-01-16 00:00:00 105
2012-01-16 01:00:00 90
Currently I am using a SQL statement as below
SELECT DATEADD(hour,-1,temp1.Timestamp) AS hour,(temp1.Data-temp2.Data) AS hourData
FROM temp AS temp1 JOIN temp AS temp2
ON DATEADD(hour,-1,temp1.Timestamp)= temp2.Timestamp
However temp table has about 12,000 rows.
There for executing the query it take about 90 seconds.
How can I do my query in less time.
Thanks in advanced.

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

Cadavre SSC-Forever Points: 41690 More actions · Answer 1

Your issue is probably due to your index.

Try this replication script: -

BEGIN TRAN

SET NOCOUNT ON

IF object_id('tempdb..#testEnvironment') IS NOT NULL

BEGIN

DROP TABLE #testEnvironment

END

--1,000,000 Random rows of data

;WITH CTE AS (

SELECT DATEADD(HOUR,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),'2000-01-01') AS [timestamp]

FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3)

SELECT TOP 1000000 IDENTITY(INT,1,1) AS ID, [timestamp],

(ABS(CHECKSUM(NEWID())) % 200) + 1 AS data

INTO #testEnvironment

FROM CTE

PRINT '========== BASELINE =========='

SET STATISTICS TIME ON

SELECT COUNT(*) FROM #testEnvironment

SET STATISTICS TIME OFF

PRINT REPLICATE('=',80)

PRINT '========== YOUR QUERY =========='

SET STATISTICS TIME ON

SELECT DATEADD(hour,-1,temp1.[timestamp]) AS [hour],(temp1.data-temp2.data) AS hourData

FROM #testEnvironment temp1

JOIN #testEnvironment temp2 ON DATEADD(hour,-1,temp1.[timestamp])= temp2.[timestamp]

SET STATISTICS TIME OFF

PRINT REPLICATE('=',80)

ROLLBACK

So that's using your query on 1 million rows, not just 12,000.

Here are the times on my box

========== BASELINE ==========

SQL Server Execution Times:

CPU time = 94 ms, elapsed time = 94 ms.

================================================================================

========== YOUR QUERY ==========

SQL Server Execution Times:

CPU time = 2545 ms, elapsed time = 4717 ms.

================================================================================

So nearly 5 seconds on 1 million rows of data.

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Paul White SSC Guru Points: 150468 More actions · Answer 2

hemaljoes (1/16/2012)
However temp table has about 12,000 rows.
There for executing the query it take about 90 seconds.
How can I do my query in less time.

Are you able to provide us with an execution plan? Run the query with Actual Execution Plan on in SSMS, right click the graphical plan after execution finishes, save to a file, and attach it to your next post. The problem should be quite clear from that. If you are able to script the table definition and current indexes, that would be helpful too.

Paul White
All articles available on SQL.kiwi
@SQL_Kiwi

WayneS SSC Guru Points: 95461 More actions · Answer 3

How does this work for you?

DECLARE @TestTable TABLE (LogTime DATETIME PRIMARY KEY, Qty INTEGER);

INSERT INTO @TestTable (LogTime, Qty)

SELECT '2012-01-16 00:00:00', 2345 UNION ALL

SELECT '2012-01-16 01:00:00', 2450 UNION ALL

SELECT '2012-01-16 02:00:00', 2540;

WITH cte AS

(

SELECT LogTime, Qty,

RN = ROW_NUMBER()

OVER (ORDER BY LogTime)

FROM @TestTable

)

SELECT curr.LogTime,

HourValue = curr.Qty - prev.Qty

FROM cte curr

JOIN cte prev

ON curr.RN = prev.RN +1;

You know, the people that help out here are all un-paid volunteers, so please HELP US HELP YOU. Providing the DDL scripts (CREATE TABLE, CREATE INDEX, etc.) for the tables affected, and INSERT statements to put some test data into those tables that shows your problem will go a long way in getting people to look at your issue and help you out. Please include code for what you have already tried. Don't forget to include what your expected results should be, based on the sample data provided. As a bonus to you, you will get tested code back. For more details on how to get all of this into your post, please look at the first link in my signature.

Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes

If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!

Links:
For better assistance in answering your questions
Performance Problems
Common date/time routines
Understanding and Using APPLY Part 1 & Part 2

Jeff Moden SSC Guru Points: 1004649 More actions · Answer 4

hemaljoes (1/16/2012)
However temp table has about 12,000 rows.
There for executing the query it take about 90 seconds.
How can I do my query in less time.

I believe you may have a much larger problem with your system. First, how long does the following query take?

SELECT [TimeStamp],Data

FROM Temp;

The key here is that you're selecting from all rows in the table meaning that even if you were to somehow get a SEEK out of a query, it's still going to do a scan behind the scenes after if finds the "first row". That's the reason why I asked the code question. That should be your bench mark for the minimum amount of time the code should take with or without even the best indexing practices.

The reason why I say you may have a serious large problem is because it only takes 42 seconds for the code above to run on my nearly decade old, single CPU, desktop box. Newer machines will do much better. 90 seconds is way to long. Based on the age and relatively low performance of my machine and what I saw your code do on Cadavre's machine, I'd say even 42 seconds is way too long. Since Cadavre showed the whole thing ran in about 5 seconds on a million rows and it's taking 90 seconds for the same code to run on your machine, I'm thinking that an extreme IO bottleneck or a serious memory problem is your real enemy here. We had a machine with a bad memory chip in it exhibit similar outlandish performance problems. As soon as we found and replace the bad memory chip, code that was previously taking an hour to run on it suddenly took only seconds.

It's also possible for such a slowdown to occur if you have an active virus scanner on the server or some real bad network problems. At this point, it could be anything but its not the code you posted and it's not the problem. Some else is seriously wrong with the machine or the software on it.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

hemaljoes Old Hand Points: 355 More actions · Answer 5

Thanks everyone for the replies.

I'm new to programming. Need bit more time to understand what some of you have told. I will try those stuff and reply ASAP.

Thanks.

Jeff Moden SSC Guru Points: 1004649 More actions · Answer 6

hemaljoes (1/16/2012)
Thanks everyone for the replies.
I'm new to programming. Need bit more time to understand what some of you have told. I will try those stuff and reply ASAP.
Thanks.

The first thing you should try is running the straight query I posted along with the time it took to run on the 12k rows. Like I said, something is really wrong if it doesn't return instantly. Hmmm... is what you're calling a "temp" table really a VIEW, perhaps?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

hemaljoes Old Hand Points: 355 More actions · Answer 7

Hi,

I was able to cut down execution time to 20sec by replacing the

DATEADD(hour,-1,temp1.Timestamp)= temp2.Timestamp

to

ON temp1.RecordNumber-1= temp2.RecordNumber

for the JOINING condition

The Original SELECT statement was

SELECT DATE_ADD(temp1.Timestamp, INTERVAL -1 HOUR) AS hour,(temp1.Data-temp2.Data) AS hourData

FROM temp AS temp1 JOIN temp AS temp2

ON temp1.RecordNumber-1= temp2.RecordNumber

Here the RecordNumber is a sequential number for a particula TLInstance. And RecordNumber and TLInstance makes the primary key for the tldata table.

I am eager to know the cause of the time reduction.

@jeff Moden

The SELECT [TimeStamp],Data FROM Temp query execute in about 0.3 sec

Thanks for the Help provided.