Selecting non-overlapping Date Ranges

  • I want to select non-overlapping date ranges, which share a common value.  I have prepared the following to illustrate my problem

    DROP TABLE IF EXISTS #SampleRanges;
    CREATE TABLE #SampleRanges (
    RowNum INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
    SampleDate DATETIME NOT NULL,
    SampleValue BIGINT NOT NULL);

    INSERT INTO #SampleRanges (SampleDate, SampleValue)
    SELECT '2018-08-30 05:07:17', 31267 UNION ALL
    SELECT '2018-08-30 05:07:18', 31267 UNION ALL
    SELECT '2018-08-30 05:12:35', 31267 UNION ALL
    SELECT '2018-08-30 07:10:10', 134159490 UNION ALL
    SELECT '2018-08-30 07:13:54', 134159490 UNION ALL
    SELECT '2018-08-30 07:17:19', 134159490 UNION ALL
    SELECT '2018-08-30 07:22:16', 134159490 UNION ALL
    SELECT '2018-08-30 07:25:42', 134159490 UNION ALL
    SELECT '2018-08-30 07:28:03', 64498 UNION ALL
    SELECT '2018-08-30 07:28:07', 64498 UNION ALL
    SELECT '2018-08-30 08:16:14', 64498 UNION ALL
    SELECT '2018-08-30 08:16:37', 134159490 UNION ALL
    SELECT '2018-08-30 08:20:27', 134159490 UNION ALL
    SELECT '2018-08-30 08:27:53', 460350737 UNION ALL
    SELECT '2018-08-30 08:28:04', 15375002 UNION ALL
    SELECT '2018-08-30 08:28:05', 15375002 UNION ALL
    SELECT '2018-08-30 08:28:22', 134159490;

    -- DESIRED OUTPUT
    /*
    RangeStartDate, RangeEndDate, SampleValue, ValueCount
    '2018-08-30 05:07:17','2018-08-30 05:12:35', 31267, 3
    '2018-08-30 07:10:10','2018-08-30 07:25:42', 134159490, 5
    '2018-08-30 07:28:03','2018-08-30 08:16:14', 64498, 2
    '2018-08-30 08:16:37','2018-08-30 08:20:27', 134159490, 2
    '2018-08-30 08:27:53','2018-08-30 08:27:53', 460350737, 1
    '2018-08-30 08:28:04','2018-08-30 08:28:05',15375002, 2
    '2018-08-30 08:28:22','2018-08-30 08:28:22', 134159490, 1
    */

    -- what I have tried
    SELECT MIN(S1.SampleDate) AS RangeStartDate, MAX(S1.SampleDate) AS RangeEndDate, S1.SampleValue, COUNT(S1.SampleValue) AS ValueCount
    FROM #SampleRanges AS S1
    -- LEFT OUTER JOIN #SampleRanges AS S2 ON S1.SampleValue = S2.SampleValue AND S1.RowNum+1 = S2.RowNum
    GROUP BY S1.SampleValue
    ORDER BY RangeStartDate
    -- This is nearly there, but produces a single range for Value 134159490, which overlaps other ranges, instead of 3 non-overlapping ranges for thate Value.
    -- I know I have to somehow JOIN the table to itself, but the JOIN I have written makes no difference
  • This is a gaps and islands problem


    WITH Src AS (
    SELECT SampleDate, SampleValue,
    ROW_NUMBER() OVER(ORDER BY SampleDate) -
    ROW_NUMBER() OVER(PARTITION BY SampleValue ORDER BY SampleDate) AS grp
    FROM #SampleRanges
    )
    SELECT MIN(SampleDate) AS RangeStartDate,
    MAX(SampleDate) AS RangeEndDate,
    SampleValue,
    COUNT(*) AS ValueCount
    FROM Src
    GROUP BY grp,SampleValue
    ORDER BY RangeStartDate;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Brilliant.  Great solution

    I knew I'd come across this before, but "gaps and islands" are terms I'd forgotten , so I can now search properly for this type of problem.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply