Detecting change week over week in SQL table

  • Hello everybody,

    trying to figure what is the best way to figure out change between snapshots of project records. Snapshots are taken weekly. Project name is unique, dates are always 7 days apart.

    We have a process that generates snapshots of products and product values and I need to detect all types of changes:

    1. When project was added from previous week - e.g. it did not exist on 12/24 and it exists on 12/31 plus calculate the value change (net addition)
    2. Project was removed - e.g. it existed on 12/24 and does not exist on 12/31 plus calculate the value change (net subtraction)
    3. Project value has changes - existed on 12/24 and exists on 12/31 but value has changed plus calculate the value change.

    The snapshots get generated every week. Attached is an example of the table and some sample data.

    What I need to detect from the sample would be:

    1. Project B was removed for 12/31/2017
    2. Project C value has changed on 01/07/2018
    3. Project D was added on 01/07/2018.

    Please, suggest solutions as you see fit.

    CREATE TABLE [dbo].[Snapshots](
        [ID] [bigint] NULL,
        [SnapshotDate] [date] NULL,
        [ProjectName] [varchar](50) NULL,
        [ProjectValue] [decimal](18, 2) NULL
    ) ON [PRIMARY]
    GO
    INSERT [dbo].[Snapshots] ([ID], [SnapshotDate], [ProjectName], [ProjectValue]) VALUES (1, CAST(N'2017-12-24' AS Date), N'Project A', CAST(10000.00 AS Decimal(18, 2)))
    INSERT [dbo].[Snapshots] ([ID], [SnapshotDate], [ProjectName], [ProjectValue]) VALUES (2, CAST(N'2017-12-24' AS Date), N'Project B', CAST(15000.00 AS Decimal(18, 2)))
    INSERT [dbo].[Snapshots] ([ID], [SnapshotDate], [ProjectName], [ProjectValue]) VALUES (3, CAST(N'2017-12-24' AS Date), N'Project C', CAST(20000.00 AS Decimal(18, 2)))
    INSERT [dbo].[Snapshots] ([ID], [SnapshotDate], [ProjectName], [ProjectValue]) VALUES (4, CAST(N'2017-12-31' AS Date), N'Project A', CAST(10000.00 AS Decimal(18, 2)))
    INSERT [dbo].[Snapshots] ([ID], [SnapshotDate], [ProjectName], [ProjectValue]) VALUES (5, CAST(N'2017-12-31' AS Date), N'Project C', CAST(20000.00 AS Decimal(18, 2)))
    INSERT [dbo].[Snapshots] ([ID], [SnapshotDate], [ProjectName], [ProjectValue]) VALUES (6, CAST(N'2018-01-07' AS Date), N'Project A', CAST(10000.00 AS Decimal(18, 2)))
    INSERT [dbo].[Snapshots] ([ID], [SnapshotDate], [ProjectName], [ProjectValue]) VALUES (7, CAST(N'2018-01-07' AS Date), N'Project C', CAST(15000.00 AS Decimal(18, 2)))
    INSERT [dbo].[Snapshots] ([ID], [SnapshotDate], [ProjectName], [ProjectValue]) VALUES (8, CAST(N'2018-01-07' AS Date), N'Project D', CAST(50000.00 AS Decimal(18, 2)))

  • And what have you tried so far to solve your problem?  Remember, we are just volunteers here and you are the one that will have to support any solution that may be provided.

  • Hi Lynn,

    excellent question. I tried two different solutions one using standard left outer join (self join) and a second one with two CTEs and joins using current and previous date.

    Can't figure out how to do this in a way where it can scale as well.

    Totally understand that everybody is a volunteer and greatly appreciate everybody's time and input.

    thanks,

    vecerda

  • vecerda - Tuesday, January 9, 2018 4:37 PM

    Hi Lynn,

    excellent question. I tried two different solutions one using standard left outer join (self join) and a second one with two CTEs and joins using current and previous date.

    Can't figure out how to do this in a way where it can scale as well.

    Totally understand that everybody is a volunteer and greatly appreciate everybody's time and input.

    thanks,

    vecerda

    How about posting the code, not just a verbal summary of what you did.

  • Where I started is this:


    select s1.Id, s1.snapshotdate, s1.ProjectName, ISNULL(s1.ProjectValue,0)-ISNULL(s2.ProjectValue,0) as Change from snapshots s1 LEFT OUTER JOIN snapshots s2
    on s1.SnapshotDate=dateadd(d,7,s2.snapshotdate) and s1.ProjectName=s2.ProjectName
    ,

    but issue here is that this way I am not detecting the removal of Project B on 12/31.

    Thanks for all of your help,

    petr

  • The fact that this table keeps a full history of all projects makes it a bit challenging. But hopefully this snippet will get you started. Let's say you want to check this table once a day to detect changes in project value. Of course, you will need to find a way to detect situation where a project (B) disappears after a certain date.

    ;WITH [f] AS (
      SELECT
        ROW_NUMBER() OVER (PARTITION BY [ProjectName] ORDER BY SnapshotDate ASC) AS [RowNumber],
        [SnapshotDate],
        [ProjectName],
        [ProjectValue]
      FROM [Snapshots] )

    SELECT
      [f].[ProjectName] AS [ProjectName],
      [f].[SnapshotDate] AS [SnapshotDate],
      [n].[SnapshotDate] AS [Date Changed],
      [f].[ProjectValue] AS [Previous Value],
      [n].[ProjectValue] AS [New Value]
    FROM [f]
    LEFT OUTER JOIN [f] AS [n]
    ON [f].[ProjectName] = [n].[ProjectName]
    AND [f].[RowNumber] + 1 = [n].[RowNumber]
    WHERE ([f].[ProjectValue] <> [n].[ProjectValue] )

    .

  • Okay, I have to leave but I am hoping that this code may help you figure out what you need to do, and if not, maybe someone else can pick up from here.


    SELECT * FROM [dbo].[Snapshots]


    DECLARE @CurrentSnapshot DATE
            , @PreviousSnapshot DATE;

    SET @CurrentSnapshot = '2017-12-31';
    SET @PreviousSnapshot = DATEADD(DAY,-7,@CurrentSnapshot);

    SELECT @CurrentSnapshot, @PreviousSnapshot;

    (SELECT .[ProjectName], .[ProjectValue] FROM [dbo].[Snapshots] AS WHERE .[SnapshotDate] = @PreviousSnapshot
     EXCEPT
     SELECT.[ProjectName], .[ProjectValue] FROM [dbo].[Snapshots] AS WHERE .[SnapshotDate] = @CurrentSnapshot)
    UNION
    (SELECT .[ProjectName], .[ProjectValue] FROM [dbo].[Snapshots] AS WHERE .[SnapshotDate] = @CurrentSnapshot
     EXCEPT
     SELECT.[ProjectName], .[ProjectValue] FROM [dbo].[Snapshots] AS WHERE .[SnapshotDate] = @PreviousSnapshot);

    SET @CurrentSnapshot = '2018-01-07';
    SET @PreviousSnapshot = DATEADD(DAY,-7,@CurrentSnapshot);

    SELECT @CurrentSnapshot, @PreviousSnapshot;

    (SELECT .[ProjectName], .[ProjectValue] FROM [dbo].[Snapshots] AS WHERE .[SnapshotDate] = @PreviousSnapshot
     EXCEPT
     SELECT.[ProjectName], .[ProjectValue] FROM [dbo].[Snapshots] AS WHERE .[SnapshotDate] = @CurrentSnapshot)
    UNION
    (SELECT .[ProjectName], .[ProjectValue] FROM [dbo].[Snapshots] AS WHERE .[SnapshotDate] = @CurrentSnapshot
     EXCEPT
     SELECT.[ProjectName], .[ProjectValue] FROM [dbo].[Snapshots] AS WHERE .[SnapshotDate] = @PreviousSnapshot);

  • Something like this is one way, with a FULL OUTER JOIN.

    SELECT ChangeDate =ISNULL(post.SnapshotDate,DATEADD(DAY,7,pre.SnapshotDate)),
           ProjectName=COALESCE(pre.ProjectName,post.ProjectName),
           ChangeType =CASE WHEN post.SnapshotDate IS NULL THEN 'Removed'
                            WHEN pre.SnapshotDate IS NULL THEN 'Added'
                            ELSE 'Value Change'
                       END,
           ValueChange=ISNULL(post.ProjectValue,0)-ISNULL(pre.ProjectValue,0)
    FROM  Snapshots pre
           FULL OUTER JOIN
           Snapshots post ON post.ProjectName=pre.ProjectName
                             AND
                             post.SnapshotDate=DATEADD(DAY,7,pre.SnapshotDate)
    WHERE  pre.ProjectName+post.ProjectName IS NULL OR pre.ProjectValue<>post.ProjectValue
    ORDER BY ChangeDate ASC, ProjectName ASC;

    Depending on your needs you may want to tweak it so that it doesn't show all the projects on the first recorded week being added and all projects being removed after the last recorded week, but this should at least give you the core of one approach.

    Cheers!

  • This is pretty close but does not identify the ones that have been removed. I feel like this should be relatively simple but I can't figure out the right kind of logic.

    Lynn, appreciate the help, but this does not solve the problem.

    Quite challenging even though it feels simple 🙂

    Vecerda

  • Jacob, this looks very very promising. I am going to play with it and see if I can adjust it to the real life scenario, but conceptually this looks like the right kind of solution.

    Thanks you so very much,

    petr

  • vecerda - Tuesday, January 9, 2018 5:58 PM

    This is pretty close but does not identify the ones that have been removed. I feel like this should be relatively simple but I can't figure out the right kind of logic.

    Lynn, appreciate the help, but this does not solve the problem.

    Quite challenging even though it feels simple 🙂

    Vecerda

    Ideally (in my simple mind), the comparison should be between a before-image and after-image. If you could somehow create a before-image and after-image from the same table (like using the snapshotdate), then it shouldn't be hard to identify something that is missing from the after-image.

  • Can you add a table? Our system has audit tables that capture the before/after picture. Basically exact same columns as your actual project table, with Changetype and TimeOfChange columns added (table already has who created/updated and when, so it captures that every time something changes anyway). Changetype is Inserted/changeBefore/changeAfter/Deleted, and triggers update the audit table on all INSERT/UPDATE/DELETE actions. you'll still have to do the comparison to figure out what changed, but it's easy to see it when it's stepped out.

    -------------------------------------------------------------------------------------------------------------------------------------
    Please follow Best Practices For Posting On Forums to receive quicker and higher quality responses

  • Here's another approach that only requires reading the table once, but it's not quite as straightforward as doing the FULL OUTER JOIN; whether such an approach will work better for your needs will come down to the age-old "It depends..." 🙂 :


    WITH CTE AS
    (
    SELECT
    ChangeDate=calc_date
    ,ProjectName
    ,ChangeType=CASE WHEN COUNT(*)=2 THEN 'Value Change'
                     WHEN COUNT(CASE WHEN calc_date>SnapshotDate THEN 1 END)=1 THEN 'Removed'
                     ELSE 'Added'
                END
    ,ValueChange=CASE WHEN COUNT(*)=2 THEN MAX(CASE WHEN calc_date=SnapshotDate THEN ProjectValue END)-MAX(CASE WHEN calc_date>SnapshotDate THEN ProjectValue END)
                      WHEN COUNT(CASE WHEN calc_date>SnapshotDate THEN 1 END)=1 THEN -MAX(ProjectValue)
                      ELSE MAX(ProjectValue)
                 END
    FROM Snapshots
         CROSS APPLY
         (VALUES(SnapshotDate, ProjectValue),(DATEADD(DAY,7,SnapshotDate),ProjectValue))x(calc_date,calc_value)
    GROUP BY ProjectName,calc_date
    )

    SELECT *
    FROM CTE
    WHERE ValueChange<>0
    ORDER BY ChangeDAte ASC, ProjectName ASC;

    Cheers!

  • Using LEAD will allow you to check the value of the next occurrence

    WITH cteLead AS (
    SELECT [ID], [SnapshotDate], [ProjectName], [ProjectValue]
      , [NextVal] = LEAD([ProjectValue]) OVER (PARTITION BY [ProjectName] ORDER BY [SnapshotDate])
      , [MaxDate] = MAX([SnapshotDate]) OVER ()
    FROM dbo.Snapshots
    )
    SELECT *
    FROM cteLead
    WHERE ([ProjectValue] != [NextVal] OR [NextVal] IS NULL)
    AND [SnapshotDate] < [MaxDate]

  • vecerda - Tuesday, January 9, 2018 5:58 PM

    This is pretty close but does not identify the ones that have been removed. I feel like this should be relatively simple but I can't figure out the right kind of logic.

    Lynn, appreciate the help, but this does not solve the problem.

    Quite challenging even though it feels simple 🙂

    Vecerda

    What I posted was not a solution.  It was something for you to look at and work with to determine if it could be the basis for a solution.  It looks to me that others are offering other solutions that are better.  Just remember, you have to support what ever solution you find that meets your requirements.  No one on this site is responsible for providing you support.  My simple recommendation is this; if you don't understand it, if you can't explain how it works to your coworkers or supervisor, then don't use it in a production environment.  I have no problem with people using solutions or modifying solutions from the internet, but I do have a problem if the can't support it themselves.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply