Max Consecutive Dates Without an Exclusion Date

  • Here's one that's been nagging me.

    I've got two tables, with date "ranges." Table A is an "order" with a start / end range, the related table B is an exclusion date range table, which could have many exclusion ranges per "order."

    Table A

    OrderID, StartDate, EndDate

    14,'2013-09-13','2013-10-01'

    Table B

    OrderID, ExclusionStart, ExclusionEnd

    14, '2013-09-14','2013-09-15'

    14, '2013-09-21','2013-09-22'

    14, '2013-09-28','2013-09-29'

    14, '2013-09-25','2013-09-25'

    Sample setup:

    CREATE TABLE [dbo].[OrderSample](

    [OrderID] [int] ,

    [OrderStartDate] [datetime] ,

    [OrderEndDate] [datetime]

    )

    --## Create sample 'order'

    INSERT INTO OrderSample VALUES(14,'2013-09-01','2013-09-30')

    CREATE TABLE [dbo].[ExclusionSample](

    [OrderID] [int] ,

    [ExclusionStartDate] [datetime] ,

    [ExclusionEndDate] [datetime]

    )

    --## Create sample 'exclusion ranges' - my product does not work these days

    INSERT INTO ExclusionSample VALUES(14, '2013-09-14','2013-09-15')

    INSERT INTO ExclusionSample VALUES(14, '2013-09-21','2013-09-22')

    INSERT INTO ExclusionSample VALUES(14, '2013-09-28','2013-09-29')

    INSERT INTO ExclusionSample VALUES(14, '2013-09-25','2013-09-25')

    What's the best way to get the MAX() consecutive valid days (uninterrupted by exclusion dates) per order?

    Desired Output:

    Table Output

    OrderID, MaxConsecValidDates

    My current solution is to use a UDF to explode both ranges and get a full list of dates per OrderID in separate "staging" tables, then filter out the excluded dates from a "ValidOrderDates" table... And run a CTE over the top of that to return the consecutive days.

    This is slow at the outset to set up, even with relatively low (300/400k rows in each table) but manageable incrementally going forward. Is there a bigger/better/faster approach? Am I overlooking a way to do this in a set-based manner?

    Thank you for looking it over!

  • I totally don't understand what the output you expect for this. Please provide the ACTUAL output you expect for your given sample. You may need to set up additional sample data (with expected outputs for each set) to fully demonstrate your requirements.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Apologies for the confusion, Kevin - thanks for having a look! In a hurry this morning, so I sketched it out pretty fast.

    I'll set up some supporting scripts for setup as soon as I have a minute, but the output I currently get looks like:

    OrderID | MaxConsecutiveDays

    14, 5

    Meaning that my required output returns, for each order, the maximum number of consecutive days which are uninterrupted by exclusion dates. In the quick sketch, I believe that would be five days for order 14, as the exclusion periods are weekends, other than one random Wednesday.

  • Given the data you posted, the expected output is 13 I hope, as there is no exclusion before the 14th.

    Anyway here is a query. I'm fairly sure it can be made more efficient:

    CREATE TABLE [dbo].[OrderSample](

    [OrderID] [int] ,

    [OrderStartDate] [datetime] ,

    [OrderEndDate] [datetime]

    )

    --## Create sample 'order'

    INSERT INTO OrderSample VALUES(14,'2013-09-01','2013-09-30')

    INSERT INTO OrderSample VALUES(16,'2013-08-01','2013-09-15')

    CREATE TABLE [dbo].[ExclusionSample](

    [OrderID] [int] ,

    [ExclusionStartDate] [datetime] ,

    [ExclusionEndDate] [datetime]

    )

    --## Create sample 'exclusion ranges' - my product does not work these days

    INSERT INTO ExclusionSample VALUES(14, '2013-09-14','2013-09-15')

    INSERT INTO ExclusionSample VALUES(14, '2013-09-21','2013-09-22')

    INSERT INTO ExclusionSample VALUES(14, '2013-09-28','2013-09-29')

    INSERT INTO ExclusionSample VALUES(14, '2013-09-25','2013-09-25')

    INSERT INTO ExclusionSample VALUES(16, '2013-08-14','2013-08-15')

    INSERT INTO ExclusionSample VALUES(16, '2013-08-21','2013-08-22')

    INSERT INTO ExclusionSample VALUES(16, '2013-08-28','2013-08-29')

    INSERT INTO ExclusionSample VALUES(16, '2013-09-14','2013-09-14')

    go

    WITH CTE1 AS (

    SELECT OS.OrderID, d.thedate,

    active = IIF(EXISTS (SELECT *

    FROM ExclusionSample ES

    WHERE OS.OrderID = ES.OrderID

    AND d.thedate BETWEEN ES.ExclusionStartDate AND ExclusionEndDate), 0, 1)

    FROM dates d

    JOIN OrderSample OS ON d.thedate BETWEEN OS.OrderStartDate AND OrderEndDate

    ), CTE2 AS (

    SELECT OrderID, active,

    SUM(active) OVER (PARTITION BY OrderID ORDER BY thedate

    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumactive,

    row_number() OVER (PARTITION BY OrderID ORDER BY thedate) AS rowno

    FROM CTE1

    ), CTE3 AS (

    SELECT OrderID, active, cumactive, rowno,

    denserank = dense_rank () OVER (PARTITION BY OrderID ORDER BY cumactive)

    FROM CTE2

    ), CTE4 AS (

    SELECT OrderID, rowno - denserank AS grp, COUNT(*) AS cnt

    FROM CTE3

    WHERE active = 1

    GROUP BY OrderID, rowno - denserank

    )

    SELECT OrderID, MAX(cnt)

    FROM CTE4

    GROUP BY OrderID

    go

    DROP TABLE OrderSample

    DROP TABLE ExclusionSample

    The query includes the table dates, here is a script (not this year's model) to populated it:

    -- Make sure it's empty.

    TRUNCATE TABLE dates

    go

    -- Get a temptable with numbers. This is a cheap, but not 100% reliable.

    -- Whence the query hint and all the checks.

    SELECT TOP 80001 n = IDENTITY(int, 0, 1)

    INTO #numbers

    FROM sysobjects o1

    CROSS JOIN sysobjects o2

    CROSS JOIN sysobjects o3

    CROSS JOIN sysobjects o4

    OPTION (MAXDOP 1)

    go

    -- Make sure we have unique numbers.

    CREATE UNIQUE CLUSTERED INDEX num_ix ON #numbers (n)

    go

    -- Verify that table does not have gaps.

    IF (SELECT COUNT(*) FROM #numbers) = 80001 AND

    (SELECT MIN(n) FROM #numbers) = 0 AND

    (SELECT MAX(n) FROM #numbers) = 80000

    BEGIN

    DECLARE @msg varchar(255)

    -- Insert the dates:

    INSERT dates (thedate)

    SELECT dateadd(DAY, n, '19800101')

    FROM #numbers

    WHERE dateadd(DAY, n, '19800101') < '21500101'

    SELECT @msg = 'Inserted ' + ltrim(str(@@rowcount)) + ' rows into #numbers'

    PRINT @msg

    END

    ELSE

    RAISERROR('#numbers is not contiguos from 0 to 80001!', 16, -1)

    go

    DROP TABLE #numbers

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • That's correct - I'd expect 13 as the result from the sample data.

    This looks like a more distilled, set based version of what I'm doing - exactly what I was after. I'm going to go through this and really check it out, see how it performs up against the larger sample set I have. Very much appreciated - thank you!!

    Erland Sommarskog (9/13/2013)


    Given the data you posted, the expected output is 13 I hope, as there is no exclusion before the 14th.

    Anyway here is a query. I'm fairly sure it can be made more efficient:

    CREATE TABLE [dbo].[OrderSample](

    [OrderID] [int] ,

    [OrderStartDate] [datetime] ,

    [OrderEndDate] [datetime]

    )

    --## Create sample 'order'

    INSERT INTO OrderSample VALUES(14,'2013-09-01','2013-09-30')

    INSERT INTO OrderSample VALUES(16,'2013-08-01','2013-09-15')

    CREATE TABLE [dbo].[ExclusionSample](

    [OrderID] [int] ,

    [ExclusionStartDate] [datetime] ,

    [ExclusionEndDate] [datetime]

    )

    --## Create sample 'exclusion ranges' - my product does not work these days

    INSERT INTO ExclusionSample VALUES(14, '2013-09-14','2013-09-15')

    INSERT INTO ExclusionSample VALUES(14, '2013-09-21','2013-09-22')

    INSERT INTO ExclusionSample VALUES(14, '2013-09-28','2013-09-29')

    INSERT INTO ExclusionSample VALUES(14, '2013-09-25','2013-09-25')

    INSERT INTO ExclusionSample VALUES(16, '2013-08-14','2013-08-15')

    INSERT INTO ExclusionSample VALUES(16, '2013-08-21','2013-08-22')

    INSERT INTO ExclusionSample VALUES(16, '2013-08-28','2013-08-29')

    INSERT INTO ExclusionSample VALUES(16, '2013-09-14','2013-09-14')

    go

    WITH CTE1 AS (

    SELECT OS.OrderID, d.thedate,

    active = IIF(EXISTS (SELECT *

    FROM ExclusionSample ES

    WHERE OS.OrderID = ES.OrderID

    AND d.thedate BETWEEN ES.ExclusionStartDate AND ExclusionEndDate), 0, 1)

    FROM dates d

    JOIN OrderSample OS ON d.thedate BETWEEN OS.OrderStartDate AND OrderEndDate

    ), CTE2 AS (

    SELECT OrderID, active,

    SUM(active) OVER (PARTITION BY OrderID ORDER BY thedate

    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumactive,

    row_number() OVER (PARTITION BY OrderID ORDER BY thedate) AS rowno

    FROM CTE1

    ), CTE3 AS (

    SELECT OrderID, active, cumactive, rowno,

    denserank = dense_rank () OVER (PARTITION BY OrderID ORDER BY cumactive)

    FROM CTE2

    ), CTE4 AS (

    SELECT OrderID, rowno - denserank AS grp, COUNT(*) AS cnt

    FROM CTE3

    WHERE active = 1

    GROUP BY OrderID, rowno - denserank

    )

    SELECT OrderID, MAX(cnt)

    FROM CTE4

    GROUP BY OrderID

    go

    DROP TABLE OrderSample

    DROP TABLE ExclusionSample

    The query includes the table dates, here is a script (not this year's model) to populated it:

    -- Make sure it's empty.

    TRUNCATE TABLE dates

    go

    -- Get a temptable with numbers. This is a cheap, but not 100% reliable.

    -- Whence the query hint and all the checks.

    SELECT TOP 80001 n = IDENTITY(int, 0, 1)

    INTO #numbers

    FROM sysobjects o1

    CROSS JOIN sysobjects o2

    CROSS JOIN sysobjects o3

    CROSS JOIN sysobjects o4

    OPTION (MAXDOP 1)

    go

    -- Make sure we have unique numbers.

    CREATE UNIQUE CLUSTERED INDEX num_ix ON #numbers (n)

    go

    -- Verify that table does not have gaps.

    IF (SELECT COUNT(*) FROM #numbers) = 80001 AND

    (SELECT MIN(n) FROM #numbers) = 0 AND

    (SELECT MAX(n) FROM #numbers) = 80000

    BEGIN

    DECLARE @msg varchar(255)

    -- Insert the dates:

    INSERT dates (thedate)

    SELECT dateadd(DAY, n, '19800101')

    FROM #numbers

    WHERE dateadd(DAY, n, '19800101') < '21500101'

    SELECT @msg = 'Inserted ' + ltrim(str(@@rowcount)) + ' rows into #numbers'

    PRINT @msg

    END

    ELSE

    RAISERROR('#numbers is not contiguos from 0 to 80001!', 16, -1)

    go

    DROP TABLE #numbers

  • I realised that CTE3 is not necessary, so the query can be simplified to:

    WITH CTE1 AS (

    SELECT OS.OrderID, d.thedate,

    active = IIF(EXISTS (SELECT *

    FROM ExclusionSample ES

    WHERE OS.OrderID = ES.OrderID

    AND d.thedate BETWEEN ES.ExclusionStartDate AND ExclusionEndDate), 0, 1)

    FROM bos_sommar..dates d

    JOIN OrderSample OS ON d.thedate BETWEEN OS.OrderStartDate AND OrderEndDate

    ), CTE2 AS (

    SELECT OrderID, active,

    SUM(active) OVER (PARTITION BY OrderID ORDER BY thedate

    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumactive,

    row_number() OVER (PARTITION BY OrderID ORDER BY thedate) AS rowno

    FROM CTE1

    ), CTE3 AS (

    SELECT OrderID, rowno - cumactive AS grp, COUNT(*) AS cnt

    FROM CTE2

    WHERE active = 1

    GROUP BY OrderID, rowno - cumactive

    )

    SELECT OrderID, MAX(cnt)

    FROM CTE3

    GROUP BY OrderID

    This also removes a sort operation from the query plan.

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • Since you are using SS2012, here is a possible solution using the offset function LEAD.

    Supposing all exclusions are inside the order interval, the idea is to generate the intervals between exclusions, which is calculated as the [ExclusionEndDate] plus 1 as the next valid [OrderStartDate] and the next [ExclusionStartDate] minus one as the valid [OrderEndDate]. We do not have more exclusions after the last one so we will use the [OrderEndDate] from the [OrderSample] table.

    There is still pending how to generate the first valid range since we are looking forward on the exclusions so we are missing the range from the start date of the order till the start date of the first exclusion. Well, I tried combining the table [OrderSample] with the join of this table and the exclusions, then I use the LEAD to find next exclusion.

    Excuse me for the formatting because I do not know how to post T-SQL code. I'll ask Erland for help 🙂

    SET NOCOUNT ON;

    USE tempdb;

    GO

    CREATE TABLE [dbo].[OrderSample](

    [OrderID] [int] ,

    [OrderStartDate] [datetime] ,

    [OrderEndDate] [datetime]

    )

    --## Create sample 'order'

    INSERT INTO OrderSample VALUES(14,'2013-09-01','2013-09-30')

    INSERT INTO OrderSample VALUES(16,'2013-08-01','2013-09-15')

    CREATE TABLE [dbo].[ExclusionSample](

    [OrderID] [int] ,

    [ExclusionStartDate] [datetime] ,

    [ExclusionEndDate] [datetime]

    )

    --## Create sample 'exclusion ranges' - my product does not work these days

    INSERT INTO ExclusionSample VALUES(14, '2013-09-14','2013-09-15')

    INSERT INTO ExclusionSample VALUES(14, '2013-09-21','2013-09-22')

    INSERT INTO ExclusionSample VALUES(14, '2013-09-28','2013-09-29')

    INSERT INTO ExclusionSample VALUES(14, '2013-09-25','2013-09-25')

    INSERT INTO ExclusionSample VALUES(16, '2013-08-14','2013-08-15')

    INSERT INTO ExclusionSample VALUES(16, '2013-08-21','2013-08-22')

    INSERT INTO ExclusionSample VALUES(16, '2013-08-28','2013-08-29')

    INSERT INTO ExclusionSample VALUES(16, '2013-09-14','2013-09-14')

    GO

    WITH c1 AS (

    SELECT

    OrderID,

    OrderStartDate,

    OrderEndDate,

    NULL AS ExclusionStartDate,

    NULL AS ExclusionEndDate

    FROM

    dbo.OrderSample

    UNION ALL

    SELECT

    OS.OrderID,

    OS.OrderStartDate,

    OS.OrderEndDate,

    ES.ExclusionStartDate,

    ES.ExclusionEndDate

    FROM

    dbo.OrderSample AS OS

    INNER JOIN

    dbo.ExclusionSample AS ES

    ON ES.OrderID = OS.OrderID

    ),

    C2 AS (

    SELECT

    OrderID,

    DATEDIFF([day],

    ISNULL(DATEADD([day], 1, ExclusionEndDate), OrderStartDate),

    ISNULL(DATEADD([day], -1, LEAD(ExclusionStartDate) OVER(PARTITION BY OrderID ORDER BY ExclusionStartDate)), OrderEndDate)

    ) + 1 AS ConsecValidDates

    FROM

    C1

    )

    SELECT

    OrderID,

    MAX(ConsecValidDates) AS MaxConsecValidDates

    FROM

    C2

    GROUP BY

    OrderID;

    GO

    DROP TABLE dbo.OrderSample, dbo.ExclusionSample;

    GO

  • I believe that if you're exclusion date ranges do not overlap, you may also be able to do it like this:

    SELECT OrderID,[MaxNonExclusionDays]=MAX([NonExclusionDays])

    FROM (

    SELECT OrderID, GapStart=MIN(GapDates), GapEnd=MAX(GapDates)

    ,[NonExclusionDays]=1+DATEDIFF(day, MIN(GapDates), MAX(GapDates))

    FROM (

    SELECT OrderID, GapDates

    ,rn=(ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY GapDates)-1)/2

    FROM (

    SELECT OrderID, GapDates

    FROM OrderSample a

    CROSS APPLY (VALUES (OrderStartDate),(OrderEndDate)) b (GapDates)

    UNION ALL

    SELECT OrderID, GapDates

    FROM ExclusionSample b

    CROSS APPLY (

    VALUES (ExclusionStartDate-1), (ExclusionEndDate+1)) c(GapDates)

    ) a

    ) a

    GROUP BY OrderID, rn

    ) a

    GROUP BY OrderID;

    I've treated the exclusion days (+ order start/end dates) like a set of islands of date ranges, and then converted them to the gaps, ultimately calculating the MAX gap. Similar to (although not exactly like) what I did in this article: The SQL of Gaps and Islands in Sequences[/url]


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply