SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


How to Compare Rows within Partitioned Sets to Find Overlapping Dates


How to Compare Rows within Partitioned Sets to Find Overlapping Dates

Author
Message
kevin.wu
kevin.wu
SSC Veteran
SSC Veteran (233 reputation)SSC Veteran (233 reputation)SSC Veteran (233 reputation)SSC Veteran (233 reputation)SSC Veteran (233 reputation)SSC Veteran (233 reputation)SSC Veteran (233 reputation)SSC Veteran (233 reputation)

Group: General Forum Members
Points: 233 Visits: 337
David McKinney (10/11/2010)
Shouldn't you have a LEFT JOIN rather than an INNER JOIN?


You're right! In haste, I ran an INNER JOIN so that I could just see the compared rows, but in hindsight, I probably should have kept my base table pure and ran a LEFT JOIN instead. This at least would have allowed me to keep my table counts the same as well as singling out the last row in each partitioned set.
kevin.wu
kevin.wu
SSC Veteran
SSC Veteran (233 reputation)SSC Veteran (233 reputation)SSC Veteran (233 reputation)SSC Veteran (233 reputation)SSC Veteran (233 reputation)SSC Veteran (233 reputation)SSC Veteran (233 reputation)SSC Veteran (233 reputation)

Group: General Forum Members
Points: 233 Visits: 337
ta.bu.shi.da.yu (10/11/2010)
If I read this correctly, you just implemented the LAG(...) Over(...) analytic function.


Cool, I didn't even know this existed. But it looks like it's only available in Oracle databases. Microsoft, please bring this over!

However, I'm interested... if you do this on a large dataset, what does the execution plan look like?


Let me try to get my execution plan and I'll get back to you. I do remember that our initial process ran painfully slow and we had just 45K rows. It ran for something like 45 seconds (almost 1 sec/1K rows) using a scalar function. This updated process now takes less than one second to complete.
k_t_Schmidt
k_t_Schmidt
SSC Veteran
SSC Veteran (257 reputation)SSC Veteran (257 reputation)SSC Veteran (257 reputation)SSC Veteran (257 reputation)SSC Veteran (257 reputation)SSC Veteran (257 reputation)SSC Veteran (257 reputation)SSC Veteran (257 reputation)

Group: General Forum Members
Points: 257 Visits: 125
Real World Example: Wage and Hour Class Action Lawsuits. I am always working with dates on employment cases and many times the individual has many start and end dates. Part of the scrubbing process is to check for date overlaps or gaps. This is a good start to check those types of things.

Thanks for the post! Smile
dwain.c
dwain.c
One Orange Chip
One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)

Group: General Forum Members
Points: 28541 Visits: 6431
SQL 2012 makes pretty short work of this type of problem:


WITH PartitionedSchedules AS
(
SELECT ScheduleID, PersonID, startDate, durationDays
,CalculatedEndDate=DATEADD(day, durationDays, startDate)
,row2startDate=LEAD(startDate, 1) OVER (PARTITION BY PersonID ORDER BY startDate)
FROM Schedules
)
SELECT ScheduleID, PersonID, startDate, durationDays, CalculatedEndDate
,row2startDate
,datedifference
,analysis=CASE SIGN(datedifference)
WHEN 0 THEN 'contiguous'
WHEN 1 THEN CAST(ABS(datedifference) AS VARCHAR) + ' days overlap'
ELSE CAST(ABS(datedifference) AS VARCHAR) + ' days gap'
END

FROM PartitionedSchedules a
CROSS APPLY
(
SELECT DATEDIFF(day, row2startDate, CalculatedEndDate)
) b (datedifference)
WHERE datedifference IS NOT NULL;




No more need for a self-join.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
dwain.c
dwain.c
One Orange Chip
One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)One Orange Chip (28K reputation)

Group: General Forum Members
Points: 28541 Visits: 6431
And here is (I think) another way to do it in SQL 2005 that avoids the self-join:


SELECT ScheduleID, PersonID, startDate, durationDays
,row2StartDate, CalculatedEndDate, datedifference
,analysis=CASE SIGN(datedifference)
WHEN 0 THEN 'contiguous'
WHEN 1 THEN CAST(ABS(datedifference) AS VARCHAR) + ' days overlap'
ELSE CAST(ABS(datedifference) AS VARCHAR) + ' days gap'
END
FROM
(
SELECT ScheduleID=MAX(CASE WHEN rn2 = 2 THEN ScheduleID END)
,PersonID
,startDate=MIN(startDate)
,durationDays=MAX(CASE WHEN rn2 = 2 THEN durationDays END)
,row2StartDate=MAX(CASE rn2 WHEN 2 THEN CalculatedEndDate ELSE [Date] END)
,CalculatedEndDate=MAX(CASE rn2 WHEN 2 THEN [Date] END)
,datedifference=DATEDIFF(day
,MAX(CASE rn2 WHEN 2 THEN CalculatedEndDate ELSE [Date] END)
,MAX(CASE rn2 WHEN 2 THEN [Date] END))
FROM
(
SELECT ScheduleID
,PersonID
,startDate
,durationDays
,CalculatedEndDate=CASE WHEN rn2=1 THEN DATEADD(day, durationDays, [Date]) END
,[Date]
,rn=ROW_NUMBER() OVER (PARTITION BY PersonID ORDER BY startDate)/2
,rn2
FROM Schedules a
CROSS APPLY
(
SELECT 1, startDate UNION ALL
SELECT 2, DATEADD(day, durationDays, startDate)
) b (rn2, [Date])
) a
GROUP BY PersonID, rn
HAVING COUNT(*) = 2
) a
ORDER BY PersonID;




This one assumes though that a row does not overlap two or more following rows.

The benefit of course of not doing a self-join is that the query does a single table or index scan (depending on indexing) instead of two.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search