Get previous row's value - with a twist!

  • Hi all,

    I have a scenario where I am dealing with assets. Every business day we get a new asset value. I need a dataset that shows the current asset value (relating to the date on the row) and the previous date's asset value (relating to the previous record of that asset by date)

    This can easily be achieved with the LAG() function, but the issue arises when an asset is 'sold'. If for example, an asset is sold on 16th November, there will not be a record for the date 16th November. However, I still need to see a row for that asset for 16th November, where the previous date's asset value would relate to the value on 15th November, and the current asset value would be zero.

    I hope this makes sense! I somehow need to create a row for an asset on the day it was sold πŸ™‚

  • mm7861 (11/16/2015)


    Hi all,

    I have a scenario where I am dealing with assets. Every business day we get a new asset value. I need a dataset that shows the current asset value (relating to the date on the row) and the previous date's asset value (relating to the previous record of that asset by date)

    This can easily be achieved with the LAG() function, but the issue arises when an asset is 'sold'. If for example, an asset is sold on 16th November, there will not be a record for the date 16th November. However, I still need to see a row for that asset for 16th November, where the previous date's asset value would relate to the value on 15th November, and the current asset value would be zero.

    I hope this makes sense! I somehow need to create a row for an asset on the day it was sold πŸ™‚

    How about some sample data with expected output?

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Sample data attached.

    The first tab shows the raw data. Asset ID 1 was purchased on 13/11/2015 and sold on 16/11/2015. This is why the only rows that exist are between 13/11/2015 and 15/11/2015, and not 16/11/2015 because it was sold on that date.

    The next tab shows what I want my results to look like. I need to see a row for the day it was sold because it still has a relevant 'Previous Value' from 15/11/2015.

    Hope this helps.

  • mm7861 (11/16/2015)


    Sample data attached.

    The first tab shows the raw data. Asset ID 1 was purchased on 13/11/2015 and sold on 16/11/2015. This is why the only rows that exist are between 13/11/2015 and 15/11/2015, and not 16/11/2015 because it was sold on that date.

    The next tab shows what I want my results to look like. I need to see a row for the day it was sold because it still has a relevant 'Previous Value' from 15/11/2015.

    Hope this helps.

    Please read this article[/url], which describes how to create and present data for easy consumption by folks wishing to help you with your issue.

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hopefully this is suitable!

    Here is the table with my raw data

    --===== If the test table already exists, drop it

    IF OBJECT_ID('TempDB..#mytable','U') IS NOT NULL

    DROP TABLE #mytable

    --===== Create the test table with

    CREATE TABLE #mytable

    (

    ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,

    DateValue DATE,

    AssetID INT,

    Value INT

    )

    --===== Setup any special required conditions especially where dates are concerned

    SET DATEFORMAT DMY

    Here is some sample data that I'm working with

    --===== Insert the test data into the test table

    INSERT INTO #myTable

    (DateValue,AssetId,Value)

    VALUES ('13-Nov-2015',1,100)

    ,('14-Nov-2015',1,200)

    ,('15-Nov-2015',1,300)

    --===== Set the identity insert back to normal

    SET IDENTITY_INSERT #mytable ON

    And my desired result is in the attached spreadsheet in the tab "Query Results"

  • maybe a start

    WITH CTE

    AS (

    SELECT

    DATEADD(day , 1 , MAX(Date)) AS Date

    , Asset_ID

    , 0 AS value

    FROM yourtable

    GROUP BY Asset_ID

    UNION

    SELECT

    Date

    , Asset_ID

    , Value

    FROM yourtable)

    SELECT

    Date

    , Asset_Id

    , lag(value , 1 , 0) OVER (PARTITION BY asset_id ORDER BY date) AS previousvalue

    , Value AS CurrentValue

    FROM CTE

    ORDER BY

    Asset_ID , Date;

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Sorry to say your script wont run for the following reasons

    1) the insert into #mytable doesnt match the definition of the temp table

    2) the select statement has 3 columns the INSERT INTO has 5

    3) You do an Identity INSERT ON but dont provide the identity column

    In short the insert should read as follows, and the Identity insert on/Off can be commented out.

    INSERT INTO #myTable

    (DateValue,AssetId,Value)

    VALUES ('13-Nov-2015',1,100)

    ,('14-Nov-2015',1,200)

    ,('15-Nov-2015',1,300)

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • Thanks Jason, copy and paste error! Fixed in original post πŸ™‚

    J Livingston I'll give that a go thanks!!

  • Just because LAG/LEAD is shiny and new, doesn't mean that it's the best approach for the job. If you were trying to do this on SQL 2008, you would presumably know how to approach this, but you're ignoring the obvious approach, because you want to try out your shiny new toys.

    SELECT ISNULL(m1.AssetID, m2.AssetID), ISNULL(m1.DateValue, DATEADD(DAY, 1, m2.DateValue)), ISNULL(m2.Value, 0) AS PrevValue, ISNULL(m1.Value, 0) AS CurValue

    FROM #mytable m1

    FULL OUTER JOIN #mytable m2

    ON m1.AssetID = m2.AssetID

    AND m1.DateValue = DATEADD(DAY, 1, m2.DateValue)

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Thanks Drew, out of curiosity why do you think LAG is not best for the job? I tried the LAG version and your version and the LAG version runs much quicker..

    J Livingston, your version works very well for me, however I've just come across an issue that I just spotted.

    With the part that adds a day to the MAX(Date), I need it changing so that only reflects a Date that actually exists in the data..

    So for example, if we would have sold the asset on 13/11/2015, we would want the next day to show up. However, the next day is 14/11/2015 which is a weekend which has no data, and the next Date in the data is 16/11/2015, which is what I'd need showing.

    Any way this would be possible?

  • mm7861 (11/17/2015)


    Thanks Drew, out of curiosity why do you think LAG is not best for the job? I tried the LAG version and your version and the LAG version runs much quicker..

    J Livingston, your version works very well for me, however I've just come across an issue that I just spotted.

    With the part that adds a day to the MAX(Date), I need it changing so that only reflects a Date that actually exists in the data..

    So for example, if we would have sold the asset on 13/11/2015, we would want the next day to show up. However, the next day is 14/11/2015 which is a weekend which has no data, and the next Date in the data is 16/11/2015, which is what I'd need showing.

    Any way this would be possible?

    this should give you an idea on how to acheive what you want

    DECLARE @datevalue AS datetime = '20151113';

    SELECT CASE WHEN DATENAME(dw , @datevalue) = 'Friday' THEN 3 ELSE 1 END;

    I am not sure I agree with you about LAG being quicker.....I dont think it will.

    if you give us an idea of the number of unique assets and date range for your data, then if I have time i will mock up some test data so that we can test the two methods provided so far.

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • At the moment we are using data from 1st September 2015, and have about 180,000 rows that are made up of 10,000 unique assets.

    I'm not sure how I am supposed to use that piece of code you just posted in my current CTE setup? It's only looking at business days, and not dates that actually exist in the data (these are mostly business days, however there may be some business days that don't exist)

  • CASE WHEN DATENAME(dw , MAX(Date)) = 'Friday' THEN DATEADD(day , 3 , MAX(Date))

    ELSE DATEADD(day , 1 , MAX(Date)) END as Date

    edit...corrected typo

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • mm7861 (11/17/2015)


    At the moment we are using data from 1st September 2015, and have about 180,000 rows that are made up of 10,000 unique assets.

    ok....so will each unique asset have contiguous dates (Mon-Fri) until being sold?

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • mm7861 (11/17/2015)


    Thanks Drew, out of curiosity why do you think LAG is not best for the job? I tried the LAG version and your version and the LAG version runs much quicker..

    J Livingston, your version works very well for me, however I've just come across an issue that I just spotted.

    With the part that adds a day to the MAX(Date), I need it changing so that only reflects a Date that actually exists in the data..

    So for example, if we would have sold the asset on 13/11/2015, we would want the next day to show up. However, the next day is 14/11/2015 which is a weekend which has no data, and the next Date in the data is 16/11/2015, which is what I'd need showing.

    Any way this would be possible?

    LAG/LEAD cannot add rows, so you would need some other mechanism to add the rows. Weekends do cause a problem with the FULL OUTER JOIN approach, but LAG/LEAD will cause problems with islands and gaps, because it's harder to recognize gaps.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply