Intersecting Date Ranges

  • Hi,

    Today I was asked what the fastest way was to find out if a set of date ranges intersect.

    To clarify, for date ranges (DR1, DR2 , DR3);

    Where DR1 INTERSECTS DR2 AND DR2 INTERSECTS DR3

    But DR1 doen't INTERSECT DR3

    RETURN False.

    So, all DRn need to intersect/overlap for a positive.

    Posted below is what I came up with. I've not done any serious testing on it (so it might even lie to you :w00t:)

    But if you know of a better way, please let me know.

    And it goes without saying that you should use it freely if it works for you... or works at all 😀

    --1.Create table variable type

    CREATE TYPE DateRanges AS TABLE

    (

    FromDateDATE,

    ToDateDATE

    )

    GO

    --2.CREATE function

    CREATE FUNCTION dbo.isDatesOverlap

    (

    @DateRanges DateRanges READONLY

    )

    RETURNS BIT

    AS

    BEGIN

    DECLARE @OverLap BIT = 1

    ;WITH Ranges AS

    (

    SELECT RangeID= ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    ,FromDate

    ,ToDate

    FROM @DateRanges

    )

    SELECT

    @OverLap =

    CASE

    WHEN

    SUM(

    CASE

    WHENT2.FromDate BETWEEN T1.FromDate AND T1.ToDate OR

    T2.ToDate BETWEEN T1.FromDate AND T1.ToDate

    THEN 1 ELSE 0

    END

    ) = COUNT(*)

    THEN 1 ELSE 0

    END

    FROMRangesT1

    CROSS JOIN

    RangesT2

    WHERE T1.RangeID = 1 AND T2.RangeID > 1

    RETURN @OverLap

    END

    GO

    --3. Call example

    DECLARE @DateRanges DateRanges

    INSERT INTO @DateRanges (FromDate, ToDate)

    SELECT '2000-01-01', '2000-01-15' UNION ALL

    SELECT '2000-01-10', '2000-02-10' UNION ALL

    SELECT '2000-01-02', '2000-01-03'

    SELECT dbo.isDatesOverlap(@DateRanges)

    Looking forward to your feedback!

    Thank you.

  • i like the cross apply however you can turn it into an Inline Scalar Function (an iTVF that returns 1 value, Thanks Jeff http://www.sqlservercentral.com/articles/T-SQL/91724/) by using the following function.

    CREATE FUNCTION dbo.isDatesOverlap

    (

    @DateRanges DateRanges READONLY

    )

    RETURNS TABLE

    AS

    RETURN

    WITH Ranges AS

    (

    SELECT RangeID= ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    ,FromDate

    ,ToDate

    FROM @DateRanges

    )

    SELECT

    CASE

    WHEN

    SUM(

    CASE

    WHENT2.FromDate BETWEEN T1.FromDate AND T1.ToDate OR

    T2.ToDate BETWEEN T1.FromDate AND T1.ToDate

    THEN 1 ELSE 0

    END

    ) = COUNT(*)

    THEN 1 ELSE 0

    END AS IsContig

    FROMRangesT1

    CROSS JOIN

    RangesT2

    WHERE T1.RangeID = 1 AND T2.RangeID > 1

    GO

    the difference is that i just return the result of the select instead of setting your @Overlap. you may pick up a decent speed boost if you are running any sort of large numbers.

    as far as the actual code, the cross apply is how i would have gone as now you can pass in an unlimited amount of date pairs.

    the usage becomes SELECT * FROM dbo.isDatesOverlap(@DateRanges)

    instead of SELECT dbo.isDatesOverlap(@DateRanges)

    however i have a question on the requirements. Given the date ranges:

    INSERT INTO @DateRanges (FromDate, ToDate)

    SELECT '2000-01-01', '2000-01-15' UNION ALL

    SELECT '2000-01-10', '2000-02-10' UNION ALL

    SELECT '2000-01-17', '2000-01-19' -- does only overlaps one range not both

    should the results be a 0


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • Very cool, thank you for the pointers and the link to Jeff's article capn.hector!

    Will run some tests in the morning 🙂

  • diamondgm (8/2/2012)


    Very cool, thank you for the pointers and the link to Jeff's article capn.hector!

    Will run some tests in the morning 🙂

    @Capn; Sorry I missed your question in the post.

    All ranges need a common intersect. If DR1 overlaps DR2 and DR3 overlaps DR2 but not DR1 then 0.

    Also, I've run some (probably very primitive) tests and the results seem pretty much neck and neck between the functions you and I have posted.

    When I have more time, I'd like to see what can be done with CROSS APPLY

    I've got to get going to work so I'm just going to post the mess;

    INSERT INTO [Sandbox].[dbo].[DateRanges]

    ([Date1]

    ,[Date2])

    SELECT CAST(CAST(RAND(CAST(CAST(NEWID() AS VARBINARY)AS INT)) * 10000 AS DATETIME) AS DATE)

    ,CAST(CAST(RAND(CAST(CAST(NEWID() AS VARBINARY)AS INT)) * 10000 AS DATETIME) AS DATE)

    GO 300000

    DELETE FROM dbo.DateRanges

    WHERE Date1>= Date2

    SELECT COUNT(*)

    FROM DateRanges

    DECLARE @DateRanges DateRanges,

    @Start DATETIME

    INSERT INTO @DateRanges (FromDate, ToDate)

    SELECT * FROM dbo.DateRanges --ORDER BY NEWID()

    SET @Start = GETDATE()

    SELECT dbo.isDatesOverlap(@DateRanges)

    SELECT DATEDIFF(MS, @Start, GETDATE())

    SET @Start = GETDATE()

    SELECT * FROM dbo.isDatesOverlap2(@DateRanges)

    SELECT DATEDIFF(MS, @Start, GETDATE())

  • CELKO (8/2/2012)


    Did you know that ANSI/ISO Standard SQL has an OVERLAPS() predicate?

    @joe; I didn't know that. Will look around for info.

    What arguments does it accepts?

  • diamondgm (8/3/2012)


    diamondgm (8/2/2012)


    Very cool, thank you for the pointers and the link to Jeff's article capn.hector!

    Will run some tests in the morning 🙂

    @Capn; Sorry I missed your question in the post.

    All ranges need a common intersect. If DR1 overlaps DR2 and DR3 overlaps DR2 but not DR1 then 0.

    Also, I've run some (probably very primitive) tests and the results seem pretty much neck and neck between the functions you and I have posted.

    When I have more time, I'd like to see what can be done with CROSS APPLY

    I've got to get going to work so I'm just going to post the mess;

    the problem with testing this is that in order to really get a feel you need large volumes of data and good random data may always yield a 0. to really test you need to set up a test bed as similar to your production environment as possible.

    if only 3 or 4 values are ever going to be passed to this function you will have a very hard time finding any difference between the two and then it becomes a matter of which way you want to call the function (Select dbo.function() or SELECT * FROM dbo.function()) in the Jeff's write up he was running over 1 million rows of data the differences become clear but if you run the same tests on a small record set (a size of 101 lets say) here is what happens to the results:

    -------------------------------------------------

    Direct N*2 - Time: 0

    --------------------------------------------------------------

    Standard Scalar UDF N*2 - Time: 0

    -----------------------------------------------------------------

    Inline Scalar Function N*2 - Time: 0

    As you can see if the input set is small enough even a very slow method that will not scale well will not have enough iterations to actually show the difference. its also the reason to test any thing on production size data to make sure that it will scale.

    Here is the test bed i used (All of the function code is from Jeff Moden's Spackle article table and columns changed to match the test bed)

    WITH e1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) x(N)),

    e2(N) AS (Select 0 UNION ALL

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM e1 a, e1 b)

    SELECT N INTO #TempTally FROM e2

    DECLARE @Result INT

    DECLARE @Start DATETIME

    SELECT @Start = GETDATE()

    SELECT @Result = N * 2

    FROM #TempTally

    SELECT 'Direct N*2 - Time: ' + CAST(DATEDIFF(ms,@Start,GETDATE()) AS VARCHAR)

    SELECT @Start = GETDATE()

    SELECT @Result = dbo.TimesTwo(N)

    FROM #TempTally

    SELECT 'Standard Scalar UDF N*2 - Time: ' + CAST(DATEDIFF(ms,@Start,GETDATE()) AS VARCHAR)

    SELECT @Start = GETDATE()

    SELECT @Result = ca.Doubled

    FROM #TempTally

    CROSS APPLY dbo.TimesTwoiSF(N) ca

    SELECT 'Inline Scalar Function N*2 - Time: ' + CAST(DATEDIFF(ms,@Start,GETDATE()) AS VARCHAR)

    References: http://www.sqlservercentral.com/articles/T-SQL/91724/


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • As Joe alluded to, there is a much simpler method of calculating intersections. The standard uses only two dates, but it easily be modified to account for multiple dates.

    SELECT CASE WHEN Max(FromDate) <= Min(ToDate) THEN 1 ELSE 0 END AS Intersecting

    FROM @DateRanges

    Here is the comparison data.

    CROSS JOIN

    Table '#2E8BB1C1'. Scan count 2, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    GROUP BY

    Table '#2E8BB1C1'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    As you can see, the GROUP BY only scans the table once, whereas the CROSS JOIN scans the table twice. As Capn said, the difference is negligible for small tables.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (8/3/2012)


    As Joe alluded to, there is a much simpler method of calculating intersections. The standard uses only two dates, but it easily be modified to account for multiple dates.

    SELECT CASE WHEN Max(FromDate) <= Min(ToDate) THEN 1 ELSE 0 END AS Intersecting

    FROM @DateRanges

    Perhaps I'm being obtuse, but how would this help with the problem as posed in my code?

    CROSS JOIN

    Table '#2E8BB1C1'. Scan count 2, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    GROUP BY

    Table '#2E8BB1C1'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    As you can see, the GROUP BY only scans the table once, whereas the CROSS JOIN scans the table twice. As Capn said, the difference is negligible for small tables.

    Would you mind showing me how you would use this? I'm at a bit of a loss.

  • FromDate ToDate

    '2000-01-01' '2000-01-15'

    '2000-01-10' '2000-02-10'

    '2000-01-02' '2000-01-03'

    Looking at the values above, you see the the Bolded value is greater than the value in italics. All three of these ranges do not overlap each other.

    FromDate ToDate

    '2000-01-01' '2000-01-15'

    '2000-01-10' '2000-02-10'

    '2000-01-02' '2000-01-10'

    Now looking at them, you see that there is an overlap.

  • @Lynn.Pettis & @drew.allen

    I can't believe I didn't understand it when drew.allen posted it.

    Thank you for clarifying 😀

    @joe

    Thank you for the explanation!

  • drew.allen (8/3/2012)


    As Joe alluded to, there is a much simpler method of calculating intersections. The standard uses only two dates, but it easily be modified to account for multiple dates.

    SELECT CASE WHEN Max(FromDate) <= Min(ToDate) THEN 1 ELSE 0 END AS Intersecting

    FROM @DateRanges

    Here is the comparison data.

    CROSS JOIN

    Table '#2E8BB1C1'. Scan count 2, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    GROUP BY

    Table '#2E8BB1C1'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    As you can see, the GROUP BY only scans the table once, whereas the CROSS JOIN scans the table twice. As Capn said, the difference is negligible for small tables.

    Drew

    Gosh, Drew. I'm just not seeing how that works. In the following example where all of the dates are overlapping, we still get a zero.

    -- drop table #testtable

    SELECT FromDate = CAST(d.FromDate AS DATETIME),

    ToDate = CAST(d.ToDate AS DATETIME)

    INTO #TestTable

    FROM (

    SELECT '2002-01-01', '2001-01-01' UNION ALL

    SELECT '2002-01-01', '2003-01-01' UNION ALL

    SELECT '2002-01-01', '2005-01-01'

    ) d (FromDate,ToDate)

    ;

    SELECT CASE WHEN Max(FromDate) <= Min(ToDate) THEN 1 ELSE 0 END AS Intersecting

    FROM #TestTable

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (8/4/2012)


    drew.allen (8/3/2012)


    As Joe alluded to, there is a much simpler method of calculating intersections. The standard uses only two dates, but it easily be modified to account for multiple dates.

    SELECT CASE WHEN Max(FromDate) <= Min(ToDate) THEN 1 ELSE 0 END AS Intersecting

    FROM @DateRanges

    Here is the comparison data.

    CROSS JOIN

    Table '#2E8BB1C1'. Scan count 2, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    GROUP BY

    Table '#2E8BB1C1'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    As you can see, the GROUP BY only scans the table once, whereas the CROSS JOIN scans the table twice. As Capn said, the difference is negligible for small tables.

    Drew

    Gosh, Drew. I'm just not seeing how that works. In the following example where all of the dates are overlapping, we still get a zero.

    -- drop table #testtable

    SELECT FromDate = CAST(d.FromDate AS DATETIME),

    ToDate = CAST(d.ToDate AS DATETIME)

    INTO #TestTable

    FROM (

    SELECT '2002-01-01', '2001-01-01' UNION ALL

    SELECT '2002-01-01', '2003-01-01' UNION ALL

    SELECT '2002-01-01', '2005-01-01'

    ) d (FromDate,ToDate)

    ;

    SELECT CASE WHEN Max(FromDate) <= Min(ToDate) THEN 1 ELSE 0 END AS Intersecting

    FROM #TestTable

    Jeff, look at your first date range, the to date is less than the from date.

  • Lynn Pettis (8/4/2012)


    Jeff, look at your first date range, the to date is less than the from date.

    {facepalm} :blush:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply