Fill in missing dates in a View...

  • Hi Team,

    I have a table as follows:

    Start_DateShopProductMetric_1Metric_2Metric_3
    02/04/2018AApple51943
    16/04/2018AApple487492
    23/04/2018AApple405626
    30/04/2018AApple409986
    02/04/2018BApple279564
    09/04/2018BApple59455
    16/04/2018BApple528260
    30/04/2018BApple289577
    02/04/2018APear921940
    16/04/2018APear31577
    30/04/2018APear357668
    02/04/2018BPear379842
    16/04/2018BPear62272

    Start_Date is always a Monday date.

    Here's how I need to look:

    Start_DateShopProductMetric_1Metric_2Metric_3
    02/04/2018AApple51943
    09/04/2018AApple000
    16/04/2018AApple487492
    23/04/2018AApple405626
    30/04/2018AApple409986
    02/04/2018BApple279564
    09/04/2018BApple59455
    16/04/2018BApple528260
    23/04/2018BApple000
    30/04/2018BApple289577
    02/04/2018APear921940
    09/04/2018APear000
    16/04/2018APear31577
    23/04/2018APear000
    30/04/2018APear357668
    02/04/2018BPear379842
    09/04/2018BPear000
    16/04/2018BPear62272
    23/04/2018BPear000
    30/04/2018BPear000

    Notice how missing dates in the first table now have records with 0s shown for each Metric.

    The idea in my head is to create a temp table of all possible Monday dates between min Start_Date and max Start_Date, and then do some kind of join in order to generate the missing records.

    Hope someone can help!

    Cheers,

    Matty

  • Variations on this theme usually work:

    SELECT *

    FROM (

    SELECT [Start_Date]

    FROM MyTable

    GROUP BY [Start_Date]

    ) d

    LEFT JOIN MyTable t ON t.[Start_Date] = d.[Start_Date]

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi ChrisM,

    Thanks for the response.

    I'd already tried something similar to that and it's returning the same number of records as exists in the original table (even though missing records do exist).

    Any other thoughts?

    Cheers,

    Matty

  • Matty_H - Friday, April 6, 2018 4:49 AM

    Hi ChrisM,

    Thanks for the response.

    I'd already tried something similar to that and it's returning the same number of records as exists in the original table (even though missing records do exist).

    Any other thoughts?

    Cheers,

    Matty

    Can you post your code Matty?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work - Friday, April 6, 2018 4:52 AM

    Matty_H - Friday, April 6, 2018 4:49 AM

    Hi ChrisM,

    Thanks for the response.

    I'd already tried something similar to that and it's returning the same number of records as exists in the original table (even though missing records do exist).

    Any other thoughts?

    Cheers,

    Matty

    Can you post your code Matty?

    Hi,

    I've used the code you provided and the result is the same: no change to the record count.  I'd originally tried a sub query using DISTINCT, but same result.

    Cheers,

    Matty

  • Here you go, Matty:
    WITH MyTable AS (
     SELECT * FROM (
     VALUES
     ('02/04/2018', 'A', 'Apple', 51, 94, 3),
     ('16/04/2018', 'A', 'Apple', 48, 74, 92),
     ('23/04/2018', 'A', 'Apple', 40, 56, 26),
     ('30/04/2018', 'A', 'Apple', 40, 99, 86),
     ('02/04/2018', 'B', 'Apple', 27, 95, 64),
     ('09/04/2018', 'B', 'Apple', 59, 4, 55),
     ('16/04/2018', 'B', 'Apple', 52, 82, 60),
     ('30/04/2018', 'B', 'Apple', 28, 95, 77),
     ('02/04/2018', 'A', 'Pear', 92, 19, 40),
     ('16/04/2018', 'A', 'Pear', 3, 15, 77),
     ('30/04/2018', 'A', 'Pear', 35, 76, 68),
     ('02/04/2018', 'B', 'Pear', 37, 98, 42),
     ('16/04/2018', 'B', 'Pear', 62, 2, 72)
     ) d ([Start_Date], Shop, Product, Metric_1, Metric_2, Metric_3)
    )
    SELECT d.[Start_Date], d.Shop, t.Product, t.Metric_1, t.Metric_2, t.Metric_3
    FROM (
     SELECT [Start_Date], Shop
     FROM (SELECT [Start_Date] FROM MyTable GROUP BY [Start_Date]) d1
     CROSS JOIN (SELECT Shop FROM MyTable GROUP BY Shop) d2
    ) d
    LEFT JOIN MyTable t ON t.[Start_Date] = d.[Start_Date] AND t.Shop = d.Shop
    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Matty_H - Friday, April 6, 2018 5:00 AM

    ChrisM@Work - Friday, April 6, 2018 4:52 AM

    Matty_H - Friday, April 6, 2018 4:49 AM

    Hi ChrisM,

    Thanks for the response.

    I'd already tried something similar to that and it's returning the same number of records as exists in the original table (even though missing records do exist).

    Any other thoughts?

    Cheers,

    Matty

    Can you post your code Matty?

    Hi,

    I've used the code you provided and the result is the same: no change to the record count.  I'd originally tried a sub query using DISTINCT, but same result.

    Cheers,

    Matty

    This might have been what you had in mind.   I'm not sure it will perform as well as Chris's code, but take a look and see what it's doing...  SET DATEFORMAT DMY;

    CREATE TABLE #ProductMetrics (
        StartDate date,
        Shop char(1),
        Product varchar(10),
        Metric_1 int,
        Metric_2 int,
        Metric_3 int
    );
    INSERT INTO #ProductMetrics (StartDate, Shop, Product, Metric_1, Metric_2, Metric_3)
    SELECT StartDate, Shop, Product, Metric_1, Metric_2, Metric_3
    FROM (
        VALUES    ('02/04/2018', 'A', 'Apple', 51, 94, 3),
                ('16/04/2018', 'A', 'Apple', 48, 74, 92),
                ('23/04/2018', 'A', 'Apple', 40, 56, 26),
                ('30/04/2018', 'A', 'Apple', 40, 99, 86),
                ('02/04/2018', 'B', 'Apple', 27, 95, 64),
                ('09/04/2018', 'B', 'Apple', 59, 4, 55),
                ('16/04/2018', 'B', 'Apple', 52, 82, 60),
                ('30/04/2018', 'B', 'Apple', 28, 95, 77),
                ('02/04/2018', 'A', 'Pear', 92, 19, 40),
                ('16/04/2018', 'A', 'Pear', 3, 15, 77),
                ('30/04/2018', 'A', 'Pear', 35, 76, 68),
                ('02/04/2018', 'B', 'Pear', 37, 98, 42),
                ('16/04/2018', 'B', 'Pear', 62, 2, 72)
        ) AS X (StartDate, Shop, Product, Metric_1, Metric_2, Metric_3);

    DECLARE @StartDate AS date = '02/04/2018';
    DECLARE @MaxDate AS date = '30/04/2018';

    WITH Numbers AS (

        SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL
        SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL
        SELECT 1 AS N UNION ALL SELECT 1 AS N UNION ALL SELECT 1 AS N
    ),
        AllDates AS (

            SELECT S.Shop, P.Product, X.TheDate
            FROM (
                SELECT DATEADD(day, 7 * (ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) - 1), @StartDate) AS TheDate
                FROM Numbers AS N1
                    CROSS APPLY Numbers AS N2
                ) AS X
                CROSS APPLY (
                    SELECT DISTINCT Shop
                    FROM #ProductMetrics
                    ) AS S
                CROSS APPLY (
                    SELECT DISTINCT Product
                    FROM #ProductMetrics
                    ) AS P
            WHERE X.TheDate <= @MaxDate
    )
    SELECT D.TheDate AS StartDate, D.Shop, D.Product,
        ISNULL(PM.Metric_1, 0) AS Metric_1,
        ISNULL(PM.Metric_2, 0) AS Metric_2,
        ISNULL(PM.Metric_3, 0) AS Metric_3
    FROM AllDates AS D
        LEFT OUTER JOIN #ProductMetrics AS PM
            ON D.Shop = PM.Shop
            AND D.TheDate = PM.StartDate
            AND D.Product = PM.Product
    ORDER BY D.Shop, D.Product, D.TheDate;

    DROP TABLE #ProductMetrics;
    SET DATEFORMAT MDY;

    I needed the SET DATEFORMAT because I'm in the US and our dates don't use day first - they use month.    You probably won't need that, but it couldn't hurt.  Best way to specify dates is YYYYMMDD.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply