January 19, 2010 at 5:26 am
Hi,
I'd like to put a solution to problem up for discussion.
My question is, is there a better way (than the below) to identify Start / Stop times for contiguous blocks of time (associated with some entity).
(Note the start stop times here concern different columns in the same row, I've seen solutions for fundamentally different problems concerning contiguous periods for a single column across multiple rows).
I came up with a solution that I believe was correct, but would accomplish a result in 5 joins (it involved finding starts for which no previous row overlapped, finding ends for which no subsequent row overlapped, and then joining the result).
I abandoned this approach, believing there was a more condensed solution, and found my may to a publication by Snodgrass ("Developing Time-Oriented Database Applications in SQL", available here http://www.cs.arizona.edu/~rts/publications.html as a pdf) that on p.165 provided a solution (which I believe provides the solution to the problem I'm looking at)
Consider the table:-
CREATE TABLE dbo.TableWithTimePeriod
(
Id int IDENTITY(1,1) NOT NULL,
EntityWithStartStopTimes nvarchar(10) NOT NULL,
StartTime datetime NOT NULL,
StopTime datetime NOT NULL -- Only considering cases when stop times exist for start times
);
Find the start and stop times marking the boundaries of each overlapping block of time (for each entity), where no 'previous' and no 'subsequent' pairs of start stop times overlap these boundaries (if you catch my drift):-
WITH SequencedByEntityAndStart
AS(SELECTEntityWithStartStopTimes,
StartTime,
StopTime,
ROW_NUMBER() OVER(ORDER BY EntityWithStartStopTimes, StartTime) AS RowNum
FROM dbo.TableWithTimePeriod
),
EntityStartWithAllAppropriateStops
AS (
SELECTF.EntityWithStartStopTimes,
F.StartTime,
L.StopTime,
-- Put these in optionally to help see what is happening
ROW_NUMBER() OVER(PARTITION BYF.EntityWithStartStopTimes,
F.StartTime
ORDER BYL.StopTime
) AS GroupedRowNum,
ROW_NUMBER() OVER(ORDER BYF.EntityWithStartStopTimes,
F.StartTime,
L.StopTime
) AS RoWNum
FROMSequencedByEntityAndStart AS F,
SequencedByEntityAndStart AS L,
SequencedByEntityAndStart AS E
WHERE F.StopTime <= L.StopTime
AND F.EntityWithStartStopTimes = L.EntityWithStartStopTimes
AND F.EntityWithStartStopTimes = E.EntityWithStartStopTimes
GROUP BYF.EntityWithStartStopTimes,
F.StartTime,
L.StopTime
HAVING COUNT(CASE
WHEN (E.StartTime < F.StartTime
AND F.StartTime <= E.StopTime
)
OR (E.StartTime <= L.StopTime
AND L.StopTime < E.StopTime
)
THEN 1
END
) =0
)
SELECTEntityWithStartStopTimes,
StartTime,
MIN(StopTime) AS EndTimeOfUse
FROM EntityStartWithAllAppropriateStops
GROUP BY EntityWithStartStopTimes,
StartTime
ORDER BYEntityWithStartStopTimes,
StartTime;
I wish I could claim all credit (or any) for the above, it appears an (extremely) elegant solution. I've put in some ROW_NUMBER functions (drawing on Mr. Ben-Gan's excellent T-SQL Querying (2005) pp.260-262) to help identify blocks for visual inspection (if playing about with the query).
To recap, is there a more elegant solution (or even a correct solution for me if the above is not) using SQL 2005 constructs?
Obviously over large tables, the above query is costly, so a WHILE loop scanning through blocks of EntityWithStartStopTimes will help.
Comments/ suggestions/ links (/even help) are most welcome.
Thank you 🙂
Viewing post 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply