Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12

How to Compare Rows within Partitioned Sets to Find Overlapping Dates Expand / Collapse
Author
Message
Posted Monday, October 11, 2010 10:13 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Saturday, May 3, 2014 11:52 PM
Points: 9, Visits: 327
David McKinney (10/11/2010)
Shouldn't you have a LEFT JOIN rather than an INNER JOIN?


You're right! In haste, I ran an INNER JOIN so that I could just see the compared rows, but in hindsight, I probably should have kept my base table pure and ran a LEFT JOIN instead. This at least would have allowed me to keep my table counts the same as well as singling out the last row in each partitioned set.
Post #1002309
Posted Monday, October 11, 2010 10:25 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Saturday, May 3, 2014 11:52 PM
Points: 9, Visits: 327
ta.bu.shi.da.yu (10/11/2010)
If I read this correctly, you just implemented the LAG(...) Over(...) analytic function.


Cool, I didn't even know this existed. But it looks like it's only available in Oracle databases. Microsoft, please bring this over!

However, I'm interested... if you do this on a large dataset, what does the execution plan look like?


Let me try to get my execution plan and I'll get back to you. I do remember that our initial process ran painfully slow and we had just 45K rows. It ran for something like 45 seconds (almost 1 sec/1K rows) using a scalar function. This updated process now takes less than one second to complete.
Post #1002320
Posted Tuesday, October 12, 2010 3:53 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Wednesday, April 9, 2014 7:47 AM
Points: 19, Visits: 125
Real World Example: Wage and Hour Class Action Lawsuits. I am always working with dates on employment cases and many times the individual has many start and end dates. Part of the scrubbing process is to check for date overlaps or gaps. This is a good start to check those types of things.

Thanks for the post! :)
Post #1003261
Posted Tuesday, December 24, 2013 11:02 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Wednesday, August 20, 2014 4:14 AM
Points: 3,618, Visits: 5,254
SQL 2012 makes pretty short work of this type of problem:

WITH PartitionedSchedules AS
(
SELECT ScheduleID, PersonID, startDate, durationDays
,CalculatedEndDate=DATEADD(day, durationDays, startDate)
,row2startDate=LEAD(startDate, 1) OVER (PARTITION BY PersonID ORDER BY startDate)
FROM Schedules
)
SELECT ScheduleID, PersonID, startDate, durationDays, CalculatedEndDate
,row2startDate
,datedifference
,analysis=CASE SIGN(datedifference)
WHEN 0 THEN 'contiguous'
WHEN 1 THEN CAST(ABS(datedifference) AS VARCHAR) + ' days overlap'
ELSE CAST(ABS(datedifference) AS VARCHAR) + ' days gap'
END

FROM PartitionedSchedules a
CROSS APPLY
(
SELECT DATEDIFF(day, row2startDate, CalculatedEndDate)
) b (datedifference)
WHERE datedifference IS NOT NULL;


No more need for a self-join.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1525854
Posted Wednesday, December 25, 2013 12:08 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Wednesday, August 20, 2014 4:14 AM
Points: 3,618, Visits: 5,254
And here is (I think) another way to do it in SQL 2005 that avoids the self-join:

SELECT ScheduleID, PersonID, startDate, durationDays
,row2StartDate, CalculatedEndDate, datedifference
,analysis=CASE SIGN(datedifference)
WHEN 0 THEN 'contiguous'
WHEN 1 THEN CAST(ABS(datedifference) AS VARCHAR) + ' days overlap'
ELSE CAST(ABS(datedifference) AS VARCHAR) + ' days gap'
END
FROM
(
SELECT ScheduleID=MAX(CASE WHEN rn2 = 2 THEN ScheduleID END)
,PersonID
,startDate=MIN(startDate)
,durationDays=MAX(CASE WHEN rn2 = 2 THEN durationDays END)
,row2StartDate=MAX(CASE rn2 WHEN 2 THEN CalculatedEndDate ELSE [Date] END)
,CalculatedEndDate=MAX(CASE rn2 WHEN 2 THEN [Date] END)
,datedifference=DATEDIFF(day
,MAX(CASE rn2 WHEN 2 THEN CalculatedEndDate ELSE [Date] END)
,MAX(CASE rn2 WHEN 2 THEN [Date] END))
FROM
(
SELECT ScheduleID
,PersonID
,startDate
,durationDays
,CalculatedEndDate=CASE WHEN rn2=1 THEN DATEADD(day, durationDays, [Date]) END
,[Date]
,rn=ROW_NUMBER() OVER (PARTITION BY PersonID ORDER BY startDate)/2
,rn2
FROM Schedules a
CROSS APPLY
(
SELECT 1, startDate UNION ALL
SELECT 2, DATEADD(day, durationDays, startDate)
) b (rn2, [Date])
) a
GROUP BY PersonID, rn
HAVING COUNT(*) = 2
) a
ORDER BY PersonID;


This one assumes though that a row does not overlap two or more following rows.

The benefit of course of not doing a self-join is that the query does a single table or index scan (depending on indexing) instead of two.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1525875
« Prev Topic | Next Topic »

Add to briefcase ««12

Permissions Expand / Collapse