I want to select non-overlapping date ranges, which share a common value. I have prepared the following to illustrate my problem
DROP TABLE IF EXISTS #SampleRanges;
CREATE TABLE #SampleRanges (
RowNum INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
SampleDate DATETIME NOT NULL,
SampleValue BIGINT NOT NULL);
INSERT INTO #SampleRanges (SampleDate, SampleValue)
SELECT '2018-08-30 05:07:17', 31267 UNION ALL
SELECT '2018-08-30 05:07:18', 31267 UNION ALL
SELECT '2018-08-30 05:12:35', 31267 UNION ALL
SELECT '2018-08-30 07:10:10', 134159490 UNION ALL
SELECT '2018-08-30 07:13:54', 134159490 UNION ALL
SELECT '2018-08-30 07:17:19', 134159490 UNION ALL
SELECT '2018-08-30 07:22:16', 134159490 UNION ALL
SELECT '2018-08-30 07:25:42', 134159490 UNION ALL
SELECT '2018-08-30 07:28:03', 64498 UNION ALL
SELECT '2018-08-30 07:28:07', 64498 UNION ALL
SELECT '2018-08-30 08:16:14', 64498 UNION ALL
SELECT '2018-08-30 08:16:37', 134159490 UNION ALL
SELECT '2018-08-30 08:20:27', 134159490 UNION ALL
SELECT '2018-08-30 08:27:53', 460350737 UNION ALL
SELECT '2018-08-30 08:28:04', 15375002 UNION ALL
SELECT '2018-08-30 08:28:05', 15375002 UNION ALL
SELECT '2018-08-30 08:28:22', 134159490;
-- DESIRED OUTPUT
/*
RangeStartDate, RangeEndDate, SampleValue, ValueCount
'2018-08-30 05:07:17','2018-08-30 05:12:35', 31267, 3
'2018-08-30 07:10:10','2018-08-30 07:25:42', 134159490, 5
'2018-08-30 07:28:03','2018-08-30 08:16:14', 64498, 2
'2018-08-30 08:16:37','2018-08-30 08:20:27', 134159490, 2
'2018-08-30 08:27:53','2018-08-30 08:27:53', 460350737, 1
'2018-08-30 08:28:04','2018-08-30 08:28:05',15375002, 2
'2018-08-30 08:28:22','2018-08-30 08:28:22', 134159490, 1
*/
-- what I have tried
SELECT MIN(S1.SampleDate) AS RangeStartDate, MAX(S1.SampleDate) AS RangeEndDate, S1.SampleValue, COUNT(S1.SampleValue) AS ValueCount
FROM #SampleRanges AS S1
-- LEFT OUTER JOIN #SampleRanges AS S2 ON S1.SampleValue = S2.SampleValue AND S1.RowNum+1 = S2.RowNum
GROUP BY S1.SampleValue
ORDER BY RangeStartDate
-- This is nearly there, but produces a single range for Value 134159490, which overlaps other ranges, instead of 3 non-overlapping ranges for thate Value.
-- I know I have to somehow JOIN the table to itself, but the JOIN I have written makes no difference
This is a gaps and islands problem
WITH Src AS (
SELECT SampleDate, SampleValue,
ROW_NUMBER() OVER(ORDER BY SampleDate) -
ROW_NUMBER() OVER(PARTITION BY SampleValue ORDER BY SampleDate) AS grp
FROM #SampleRanges
)
SELECT MIN(SampleDate) AS RangeStartDate,
MAX(SampleDate) AS RangeEndDate,
SampleValue,
COUNT(*) AS ValueCount
FROM Src
GROUP BY grp,SampleValue
ORDER BY RangeStartDate;
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537September 3, 2019 at 3:26 pm
Brilliant. Great solution
I knew I'd come across this before, but "gaps and islands" are terms I'd forgotten , so I can now search properly for this type of problem.
September 4, 2019 at 9:04 am
Viewing 4 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy