Storing the datediff between a row and the previous row

  • So I want to store the difference in milliseconds between rows in a column

    Table format

    ID , Label, Realdate, datediff .

    I've tried calculating the datediff using the following statement but if I try and combine with an update to the timediff column  Im getting no joy

     


    SELECT id,label,realdate, DATEDIFF(MILLISECOND, pDataDate, realdate) as timediff
    FROM


    ( SELECT id,label,realdate,
    LAG(realdate) OVER (ORDER BY realdate) pDataDate
    FROM sample_node
    where id = 1
    and label = 'metric2'
    ) a

    Here is the table schema and sample input


    create table sample_node
    (
    id int,
    label varchar(10) ,
    realdate datetime ,
    datedif int
    )

    insert sample_node (id,label,realdate)
    values (1,'metric1','2021-03-11 00:42:00.000')
    , (1,'metric2','2021-03-11 00:42:03.000')
    , (1,'metric2','2021-03-11 00:42:07.000')
    , (1,'metric2','2021-03-11 00:42:08.000')

    , (2,'metric1','2021-03-12 00:43:00.000')
    , (2,'metric1','2021-03-12 00:44:03.000')
    , (2,'metric1','2021-03-21 00:44:05.000')

    I realise I'll have to maybe cursor to step through the table to get it to only compare groups of ids & labels.

     

    But some help with the actual update in the first place would be appreciated

     

    thanks Simon

  • Something like this?

    WITH calcs
    AS (SELECT id
    ,label
    ,realdate
    ,pDataDate = LAG(realdate) OVER (PARTITION BY id, label ORDER BY realdate)
    FROM sample_node)
    UPDATE trg
    SET trg.datedif = DATEDIFF(MILLISECOND, calcs.pDataDate, trg.realdate)
    FROM sample_node trg
    JOIN calcs
    ON calcs.id = trg.id
    AND calcs.label = trg.label AND calcs.realdate = trg.realdate
    WHERE calcs.pDataDate IS NOT NULL;

    SELECT *
    FROM sample_node sn;

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • I always forget about updating the CTE directly. Here is a better version:

    WITH calcs
    AS (SELECT id
    ,label
    ,realdate
    ,pDataDate = LAG(realdate) OVER (PARTITION BY id, label ORDER BY realdate)
    ,datedif
    FROM #sample_node)
    UPDATE calcs
    SET calcs.datedif = DATEDIFF(MILLISECOND, calcs.pDataDate, calcs.realdate)
    WHERE calcs.pDataDate IS NOT NULL;

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • thank you Phil much appreciated

  • Actually neither version works 🙁   I've ran both and here is what they return and here is what I would expect (excuse the date format I copied via excel)

     

    date example

  • Run this code:

    DROP TABLE IF EXISTS #sample_node;

    CREATE TABLE #sample_node
    (
    id INT
    ,label VARCHAR(10)
    ,realdate DATETIME
    ,datedif INT
    );

    INSERT #sample_node
    (
    id
    ,label
    ,realdate
    )
    VALUES
    (1, 'metric1', '2021-03-11 00:42:00.000')
    ,(1, 'metric2', '2021-03-11 00:42:03.000')
    ,(1, 'metric2', '2021-03-11 00:42:07.000')
    ,(1, 'metric2', '2021-03-11 00:42:08.000')
    ,(2, 'metric1', '2021-03-12 00:43:00.000')
    ,(2, 'metric1', '2021-03-12 00:44:03.000')
    ,(2, 'metric1', '2021-03-21 00:44:05.000');


    WITH calcs
    AS (SELECT id
    ,label
    ,realdate
    ,pDataDate = LAG(realdate) OVER (PARTITION BY id, label ORDER BY realdate)
    ,datedif
    FROM #sample_node)
    UPDATE calcs
    SET calcs.datedif = DATEDIFF(MILLISECOND, calcs.pDataDate, calcs.realdate)
    WHERE calcs.pDataDate IS NOT NULL;

    SELECT *
    FROM #sample_node sn;

    It produces this:

    2021-07-17_13-24-17

    Tell me what is wrong with it, please.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Hi Phil, This one looks good. Since the datedif field is null on population and desired output has a zero for the first row of each partition, the poster might want something like an ISNULL added to the update:

    WITH calcs AS (
    SELECT id,label,realdate,
    pDataDate = LAG(realdate) OVER (PARTITION BY id, label ORDER BY realdate),
    datedif
    FROM #sample_node)
    UPDATE calcs
    SET calcs.datedif = ISNULL(DATEDIFF(MILLISECOND, calcs.pDataDate, calcs.realdate),0)
    --WHERE calcs.pDataDate IS NOT NULL;
    ;

    Also, the first submitted query with the join has trouble when the partition has multiple records with the same realdate, resulting in a bit of a cross join: calcs.realdate = trg.realdate. Oh, the times those date columns have gotten me. LOL

     

  • Phil, apologies for the delay it took some working out !

     

    The code is fine , in the real table there is a third column that only contains rows for certain labels so I needed to add this to the code . It only populated for like 1 in a million rows hence I was blissfully unaware of it !

     

    many thanks Simon

  • I haven't been on SQL.Central for a long time, so i just noticed the post.

    Not sure if this is the simplest solution, but for sure is the shortest for typing:

    SELECT [LAbel], id, Realdate
    , DateDif = - datediff(millisecond
    ,realdate
    , nullif (Lag(RealDate,1,0)
    OVER(PArtition BY [Label]
    ORDER BY id, realdate),'1900-01-01 00:00:00.000')
    )
    FROM #sample_node_orig
    ;

    Sample data is from the original post, just table name changed:

    DROP TABLE IF EXISTS #sample_node_orig;

    CREATE TABLE #sample_node_orig
    (
    id INT
    ,label VARCHAR(10)
    ,realdate DATETIME
    ,datedif INT
    );

    INSERT #sample_node_orig (id,label,realdate)
    values (1,'metric1','2021-03-11 00:42:00.000')
    , (1,'metric2','2021-03-11 00:42:03.000')
    , (1,'metric2','2021-03-11 00:42:07.000')
    , (1,'metric2','2021-03-11 00:42:08.000')

    , (2,'metric1','2021-03-12 00:43:00.000')
    , (2,'metric1','2021-03-12 00:44:03.000')
    , (2,'metric1','2021-03-21 00:44:05.000')

    Cheers

    Zidar's Theorem: The best code is no code at all...

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply