Date Gap Problem

  • My brain is mush and I just cannot work this one out. I need a good solution to find any gaps between enddate and startdate > 30 days for the period 2006-01-01 and 2009-12-31. The data is a bit crappy, with overlapping periods. here is a representative sample, and I think only member 3 should qualify as having a 30 day gap in coverage. Thanks for your help!!

    create table #membership (memberid int, startdate datetime, enddate datetime)

    --clean data, no overlaps, no > 30 day gap

    insert #membership values (1,'2005-10-01','2006-09-30')

    insert #membership values (1,'2006-10-01','2007-12-31')

    insert #membership values (1,'2008-01-15','2008-12-31') --only 15 day gap here

    insert #membership values (1,'2009-01-01','9999-12-31')

    insert #membership values (2,'2005-10-01','2006-09-30')

    insert #membership values (2,'2006-10-01','2007-09-30') --note overlap

    insert #membership values (2,'2007-02-03','2007-12-31')

    insert #membership values (2,'2008-01-15','2008-12-31') --only 15 day gap here

    insert #membership values (2,'2009-01-01','9999-12-31')

    insert #membership values (3,'2005-10-01','2006-09-30')

    insert #membership values (3,'2006-10-01','2007-09-30')

    --> 30 day gap here

    insert #membership values (3,'2008-01-01','2008-12-31')

    insert #membership values (3,'2009-01-01','9999-12-31')

    insert #membership values (4,'2005-10-01','2006-09-30')

    insert #membership values (4,'2006-10-01','2007-09-30')

    --> 30 day gap

    insert #membership values (4,'2008-01-01','2008-12-31')

    insert #membership values (4,'2009-01-01','9999-12-31')

    --but this record covers above gap

    insert #membership values (4,'2003-01-01','9999-12-31')

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • How about using ROW_NUMBER(PARTITION BY Member_ID, ORDER BY EndDate)

    Then self join on ROWNumber = Rownumber + 1 and member_id= member_id

    where dateadd(D, 30, tbl1.enddate) <= tbl2.stardate

    I'm sure you'll figure out the fast way to do this... might I suggest persisted indexed column on startdate - 30 or something similar?

  • Ninja's_RGR'us (10/17/2010)


    How about using ROW_NUMBER(PARTITION BY Member_ID, ORDER BY EndDate)

    Then self join on ROWNumber = Rownumber + 1 and member_id= member_id

    where dateadd(D, 30, tbl1.enddate) <= tbl2.stardate

    I'm sure you'll figure out the fast way to do this... might I suggest persisted indexed column on startdate - 30 or something similar?

    I gave that a thought, but I think both the overlap and the overarching date range mess things up. Will play around with it, but not tonight - my brain is totally fried, and wife just got home from w/e away. Thanks for the suggestion - more tomorrow!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (10/17/2010)


    My brain is mush and I just cannot work this one out. I need a good solution to find any gaps between enddate and startdate > 30 days for the period 2006-01-01 and 2009-12-31. The data is a bit crappy, with overlapping periods. here is a representative sample, and I think only member 3 should qualify as having a 30 day gap in coverage. Thanks for your help!!

    create table #membership (memberid int, startdate datetime, enddate datetime)

    --clean data, no overlaps, no > 30 day gap

    insert #membership values (1,'2005-10-01','2006-09-30')

    insert #membership values (1,'2006-10-01','2007-12-31')

    insert #membership values (1,'2008-01-15','2008-12-31') --only 15 day gap here

    insert #membership values (1,'2009-01-01','9999-12-31')

    insert #membership values (2,'2005-10-01','2006-09-30')

    insert #membership values (2,'2006-10-01','2007-09-30') --note overlap

    insert #membership values (2,'2007-02-03','2007-12-31')

    insert #membership values (2,'2008-01-15','2008-12-31') --only 15 day gap here

    insert #membership values (2,'2009-01-01','9999-12-31')

    insert #membership values (3,'2005-10-01','2006-09-30')

    insert #membership values (3,'2006-10-01','2007-09-30')

    --> 30 day gap here

    insert #membership values (3,'2008-01-01','2008-12-31')

    insert #membership values (3,'2009-01-01','9999-12-31')

    insert #membership values (4,'2005-10-01','2006-09-30')

    insert #membership values (4,'2006-10-01','2007-09-30')

    --> 30 day gap

    insert #membership values (4,'2008-01-01','2008-12-31')

    insert #membership values (4,'2009-01-01','9999-12-31')

    --but this record covers above gap

    insert #membership values (4,'2003-01-01','9999-12-31')

    The following does the trick and will do a million rows in just a couple of seconds. As usual, the details are in the comments.

    --===== Assuming the original table cannot be modified, create a new table

    -- with a new column and transfer the data all on the fly.

    SELECT MemberID = ISNULL(MemberID,0),

    StartDate = ISNULL(StartDate,0),

    EndDate = ISNULL(EndDate,0),

    Gap = CAST(NULL AS INT)

    INTO #Work

    FROM #Membership;

    --===== Add the quintessential clustered index.

    -- Note we don't name it because constraints must be unique in the DB.

    ALTER TABLE #Work

    ADD PRIMARY KEY CLUSTERED (MemberID, StartDate, EndDate);

    --===== Create some working variables and preset them

    DECLARE @RowNum BIGINT,

    @PrevMemberID INT,

    @PrevEndDate DATETIME,

    @gap INT; --This is a "dummy" variable to make it all work

    SELECT @RowNum = 1;

    --===== Do the update to calculate the Gap on the fly

    WITH

    cteEnumerate AS

    (

    SELECT RowNum = ROW_NUMBER() OVER (ORDER BY MemberID, StartDate, EndDate),

    MemberID,

    StartDate,

    EndDate,

    Gap

    FROM #Work

    )

    UPDATE tgt

    SET @gap = Gap

    = CASE

    WHEN RowNum = @RowNum --Forces error if gets out of sync

    THEN CASE

    WHEN MemberID = @PrevMemberID

    AND DATEADD(dd,-1,StartDate) <= ISNULL(@PrevEndDate,StartDate)

    THEN 0

    WHEN MemberID <> ISNULL(@PrevMemberID,0)

    THEN 0

    ELSE DATEDIFF(dd,@PrevEndDate,StartDate)

    END

    ELSE 1/0 --Forces error if gets out of sync

    END,

    @RowNum = @RowNum + 1,

    @PrevEndDate = CASE -- Keep the largest EndDate for any given member

    WHEN MemberID = @PrevMemberID

    AND @PrevEndDate > EndDate

    THEN @PrevEndDate

    ELSE EndDate

    END,

    @PrevMemberID = MemberID

    FROM cteEnumerate tgt WITH (TABLOCKX) --Not required but does save some time

    OPTION (MAXDOP 1); --Absolutely required. We cannot allow parallelism for this type of update.

    --===== Let's see the result

    SELECT * FROM #Work

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Sorry... forgot to print out the results...

    MemberIDStartDateEndDateGap

    12005-10-01 00:00:00.0002006-09-30 00:00:00.0000

    12006-10-01 00:00:00.0002007-12-31 00:00:00.0000

    12008-01-15 00:00:00.0002008-12-31 00:00:00.00015

    12009-01-01 00:00:00.0009999-12-31 00:00:00.0000

    22005-10-01 00:00:00.0002006-09-30 00:00:00.0000

    22006-10-01 00:00:00.0002007-09-30 00:00:00.0000

    22007-02-03 00:00:00.0002007-12-31 00:00:00.0000

    22008-01-15 00:00:00.0002008-12-31 00:00:00.00015

    22009-01-01 00:00:00.0009999-12-31 00:00:00.0000

    32005-10-01 00:00:00.0002006-09-30 00:00:00.0000

    32006-10-01 00:00:00.0002007-09-30 00:00:00.0000

    32008-01-01 00:00:00.0002008-12-31 00:00:00.00093

    32009-01-01 00:00:00.0009999-12-31 00:00:00.0000

    42003-01-01 00:00:00.0009999-12-31 00:00:00.0000

    42005-10-01 00:00:00.0002006-09-30 00:00:00.0000

    42006-10-01 00:00:00.0002007-09-30 00:00:00.0000

    42008-01-01 00:00:00.0002008-12-31 00:00:00.0000

    42009-01-01 00:00:00.0009999-12-31 00:00:00.0000

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Kevin,

    This works, but the performance is atrocious. I'm posting it so that others (Jeff???) can see what I'm doing, and to improve upon it.

    declare @GapDays int,

    @StartDate datetime,

    @EndDate datetime;

    select @GapDays = 30,

    @StartDate = '20060101',

    @EndDate = '20091231';

    ;WITH

    --TENS (N) AS (SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL

    -- SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL

    -- SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0),

    --THOUSANDS (N) AS (SELECT 1 FROM TENS t1 CROSS JOIN TENS t2 CROSS JOIN TENS t3),

    --MILLIONS (N) AS (SELECT 1 FROM THOUSANDS t1 CROSS JOIN THOUSANDS t2),

    --TALLY (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM MILLIONS),

    DATES AS

    (

    -- get all of the dates between @StartDate and @EndDate, including those values.

    -- if you don't have a permanent tally table, uncomment the lines above and

    -- remove the dbo schema-qualifier on the tally table to use a virtual tally table.

    -- the TOP clause restricts to just the dates that we are interested in seeing.

    SELECT TOP (DateDiff(day, @StartDate, @EndDate)+1)

    N,

    MyDate = DateAdd(day, N-1, @StartDate)

    FROM dbo.TALLY

    ),

    MemberDates AS

    (

    -- Get all of the dates for each memberid

    SELECT m.memberid, ds.MyDate

    FROM #membership m

    CROSS APPLY(SELECT MyDate FROM DATES WHERE MyDate BETWEEN m.StartDate and m.EndDate) ds

    ),

    GAP_RANGE AS

    (

    -- Get the gaps for each memberid

    SELECT memberid,

    GapStart = (SELECT DateAdd(day, 1, ISNULL(MAX(lo.MyDate),0))

    FROM MemberDates lo

    WHERE lo.MyDate < hi.MyDate

    AND lo.memberid = hi.memberid),

    GapEnd = DateAdd(day, -1, hi.MyDate)

    FROM MemberDates hi

    WHERE hi.MyDate NOT IN (SELECT DateAdd(day, 1, MyDate) FROM MemberDates WHERE memberid = hi.memberid)

    )

    -- show the gaps per memberid, along with what the gap is

    -- only show gaps >= specified value.

    SELECT *,

    Gap = DateDiff(day, GapStart, GapEnd)+1

    FROM GAP_RANGE

    WHERE GapEnd > @StartDate

    AND DateDiff(day, GapStart, GapEnd)+1 >= @GapDays

    Edit: I see Jeff and I were posting at the same time

    Edit2: here are my results:

    memberid GapStart GapEnd Gap

    ----------- ----------------------- ----------------------- -----------

    3 2007-10-01 00:00:00.000 2007-12-31 00:00:00.000 92

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Jeff Moden (10/17/2010)


    Sorry... forgot to print out the results...

    MemberIDStartDateEndDateGap

    12006-10-01 00:00:00.0002007-12-31 00:00:00.0000

    12008-01-15 00:00:00.0002008-12-31 00:00:00.00015

    22007-02-03 00:00:00.0002007-12-31 00:00:00.0000

    22008-01-15 00:00:00.0002008-12-31 00:00:00.00015

    32006-10-01 00:00:00.0002007-09-30 00:00:00.0000

    32008-01-01 00:00:00.0002008-12-31 00:00:00.00093

    Jeff, are these gaps correct? By my calc, there are off (high) by one day.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (10/17/2010)


    This works, but the performance is atrocious.

    Because of the very large datespan from any date in this century to 9999-12-31, I'm thinking that my ol' friend the Tally Table just isn't the way to go on this one. Quirky Update with a safety check will do the trick nicely.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (10/17/2010)


    WayneS (10/17/2010)


    This works, but the performance is atrocious.

    Because of the very large datespan from any date in this century to 9999-12-31, I'm thinking that my ol' friend the Tally Table just isn't the way to go on this one. Quirky Update with a safety check will do the trick nicely.

    Check out the TOP clause I'm using with the tally table... I'm restricting it to just the 1462 dates involved in this.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (10/17/2010)


    Jeff Moden (10/17/2010)


    Sorry... forgot to print out the results...

    MemberIDStartDateEndDateGap

    12006-10-01 00:00:00.0002007-12-31 00:00:00.0000

    12008-01-15 00:00:00.0002008-12-31 00:00:00.00015

    22007-02-03 00:00:00.0002007-12-31 00:00:00.0000

    22008-01-15 00:00:00.0002008-12-31 00:00:00.00015

    32006-10-01 00:00:00.0002007-09-30 00:00:00.0000

    32008-01-01 00:00:00.0002008-12-31 00:00:00.00093

    Jeff, are these gaps correct? By my calc, there are off (high) by one day.

    I agree... but Kevin called them 15 day gaps so I was going with the flow. 😀 Adding a strategic "-1" does the trick...

    drop table #membership, #Work

    create table #membership (memberid int, startdate datetime, enddate datetime)

    --clean data, no overlaps, no > 30 day gap

    insert #membership values (1,'2005-10-01','2006-09-30')

    insert #membership values (1,'2006-10-01','2007-12-31')

    insert #membership values (1,'2008-01-15','2008-12-31') --only 15 day gap here

    insert #membership values (1,'2009-01-01','9999-12-31')

    insert #membership values (2,'2005-10-01','2006-09-30')

    insert #membership values (2,'2006-10-01','2007-09-30') --note overlap

    insert #membership values (2,'2007-02-03','2007-12-31')

    insert #membership values (2,'2008-01-15','2008-12-31') --only 15 day gap here

    insert #membership values (2,'2009-01-01','9999-12-31')

    insert #membership values (3,'2005-10-01','2006-09-30')

    insert #membership values (3,'2006-10-01','2007-09-30')

    --> 30 day gap here

    insert #membership values (3,'2008-01-01','2008-12-31')

    insert #membership values (3,'2009-01-01','9999-12-31')

    insert #membership values (4,'2005-10-01','2006-09-30')

    insert #membership values (4,'2006-10-01','2007-09-30')

    --> 30 day gap

    insert #membership values (4,'2008-01-01','2008-12-31')

    insert #membership values (4,'2009-01-01','9999-12-31')

    --but this record covers above gap

    insert #membership values (4,'2003-01-01','9999-12-31')

    --===== Assuming the original table cannot be modified, create a new table

    -- with a new column and transfer the data all on the fly.

    SELECT MemberID = ISNULL(MemberID,0),

    StartDate = ISNULL(StartDate,0),

    EndDate = ISNULL(EndDate,0),

    Gap = CAST(NULL AS INT)

    INTO #Work

    FROM #Membership;

    --===== Add the quintessential clustered index.

    -- Note we don't name it because constraints must be unique in the DB.

    ALTER TABLE #Work

    ADD PRIMARY KEY CLUSTERED (MemberID, StartDate, EndDate);

    --===== Creat some working variables and preset them

    DECLARE @RowNum BIGINT,

    @PrevMemberID INT,

    @PrevEndDate DATETIME,

    @gap INT; --This is a "dummy" variable to make it all work

    SELECT @RowNum = 1;

    --===== Do the update to calculate the Gap on the fly

    WITH

    cteEnumerate AS

    (

    SELECT RowNum = ROW_NUMBER() OVER (ORDER BY MemberID, StartDate, EndDate),

    MemberID,

    StartDate,

    EndDate,

    Gap

    FROM #Work

    )

    UPDATE tgt

    SET @gap = Gap

    = CASE

    WHEN RowNum = @RowNum --Forces error if gets out of sync

    THEN CASE

    WHEN MemberID = @PrevMemberID

    AND DATEADD(dd,-1,StartDate) <= ISNULL(@PrevEndDate,StartDate)

    THEN 0

    WHEN MemberID <> ISNULL(@PrevMemberID,0)

    THEN 0

    ELSE DATEDIFF(dd,@PrevEndDate,StartDate)-1

    END

    ELSE 1/0 --Forces error if gets out of sync

    END,

    @RowNum = @RowNum + 1,

    @PrevEndDate = CASE -- Keep the largest EndDate for any given member

    WHEN MemberID = @PrevMemberID

    AND @PrevEndDate > EndDate

    THEN @PrevEndDate

    ELSE EndDate

    END,

    @PrevMemberID = MemberID

    FROM cteEnumerate tgt WITH (TABLOCKX) --Not required but does save some time

    OPTION (MAXDOP 1); --Absolutely required. We cannot allow parallelism for this type of update.

    --===== Let's see the result

    SELECT * FROM #Work

    ... and that gives the following results...

    MemberID StartDate EndDate Gap

    ----------- ----------------------- ----------------------- -----------

    1 2005-10-01 00:00:00.000 2006-09-30 00:00:00.000 0

    1 2006-10-01 00:00:00.000 2007-12-31 00:00:00.000 0

    1 2008-01-15 00:00:00.000 2008-12-31 00:00:00.000 14

    1 2009-01-01 00:00:00.000 9999-12-31 00:00:00.000 0

    2 2005-10-01 00:00:00.000 2006-09-30 00:00:00.000 0

    2 2006-10-01 00:00:00.000 2007-09-30 00:00:00.000 0

    2 2007-02-03 00:00:00.000 2007-12-31 00:00:00.000 0

    2 2008-01-15 00:00:00.000 2008-12-31 00:00:00.000 14

    2 2009-01-01 00:00:00.000 9999-12-31 00:00:00.000 0

    3 2005-10-01 00:00:00.000 2006-09-30 00:00:00.000 0

    3 2006-10-01 00:00:00.000 2007-09-30 00:00:00.000 0

    3 2008-01-01 00:00:00.000 2008-12-31 00:00:00.000 92

    3 2009-01-01 00:00:00.000 9999-12-31 00:00:00.000 0

    4 2003-01-01 00:00:00.000 9999-12-31 00:00:00.000 0

    4 2005-10-01 00:00:00.000 2006-09-30 00:00:00.000 0

    4 2006-10-01 00:00:00.000 2007-09-30 00:00:00.000 0

    4 2008-01-01 00:00:00.000 2008-12-31 00:00:00.000 0

    4 2009-01-01 00:00:00.000 9999-12-31 00:00:00.000 0

    (18 row(s) affected)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • WayneS (10/17/2010)


    Jeff Moden (10/17/2010)


    WayneS (10/17/2010)


    This works, but the performance is atrocious.

    Because of the very large datespan from any date in this century to 9999-12-31, I'm thinking that my ol' friend the Tally Table just isn't the way to go on this one. Quirky Update with a safety check will do the trick nicely.

    Check out the TOP clause I'm using with the tally table... I'm restricting it to just the 1462 dates involved in this.

    Sorry... I missed that. It still causes scans on internal tables that are more than 26K rows deep for about the same reason. Take a look at the actual execution plan.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here's an update to my code. I've moved the 1462 dates we're concerned with into a separate temp table, instead of being called through a CTE numerous times. Greatly improves the performance, but the QU will still be faster.

    declare @GapDays int,

    @StartDate datetime,

    @EndDate datetime;

    select @GapDays = 30,

    @StartDate = '20060101',

    @EndDate = '20091231';

    if object_id('tempdb..#Dates') IS NOT NULL DROP TABLE #Dates;

    CREATE TABLE #Dates (N INT, MyDate datetime PRIMARY KEY CLUSTERED);

    --;WITH

    --TENS (N) AS (SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL

    -- SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL

    -- SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0),

    --THOUSANDS (N) AS (SELECT 1 FROM TENS t1 CROSS JOIN TENS t2 CROSS JOIN TENS t3),

    --MILLIONS (N) AS (SELECT 1 FROM THOUSANDS t1 CROSS JOIN THOUSANDS t2),

    --TALLY (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM MILLIONS)

    -- get all of the dates between @StartDate and @EndDate, including those values.

    -- the TOP clause restricts to just the dates that we are interested in seeing.

    -- if you don't have a permanent tally table, un-comment the lines above, and

    -- remove the dbo schema-qualifier from the tally table below.

    INSERT INTO #Dates

    SELECT TOP (DateDiff(day, @StartDate, @EndDate)+1)

    N,

    MyDate = DateAdd(day, N-1, @StartDate)

    FROM dbo.TALLY

    ;WITH MemberDates AS

    (

    -- Get all of the dates for each memberid

    SELECT m.memberid, ds.MyDate

    FROM #membership m

    CROSS APPLY(SELECT MyDate FROM #DATES WHERE MyDate BETWEEN m.StartDate and m.EndDate) ds

    ),

    GAP_RANGE AS

    (

    -- Get the gaps for each memberid

    SELECT memberid,

    GapStart = (SELECT DateAdd(day, 1, ISNULL(MAX(lo.MyDate),0))

    FROM MemberDates lo

    WHERE lo.MyDate < hi.MyDate

    AND lo.memberid = hi.memberid),

    GapEnd = DateAdd(day, -1, hi.MyDate)

    FROM MemberDates hi

    WHERE hi.MyDate NOT IN (SELECT DateAdd(day, 1, MyDate) FROM MemberDates WHERE memberid = hi.memberid)

    )

    -- show the gaps per memberid, along with what the gap is

    -- only show gaps >= specified value.

    SELECT *,

    Gap = DateDiff(day, GapStart, GapEnd)+1

    FROM GAP_RANGE

    WHERE GapEnd > @StartDate

    AND DateDiff(day, GapStart, GapEnd)+1 >= @GapDays

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • A slightly different approach,

    uses this technique http://sqlblogcasts.com/blogs/sqlandthelike/archive/2009/08/27/sql-and-contiguous-data-ranges.aspx to build a new non-overlapping list of ranges first.

    I would be interested in comparative performance over a larger dataset. 😉

    Drop Table #NewRanges

    go

    Create Table #NewRanges

    (

    MemberId int,

    StartDate datetime,

    EndDate datetime,

    RowN integer

    )

    go

    with cteDateFix

    as

    (

    select memberid,startdate,enddate as origend,

    case when enddate ='9999-12-31'

    then max(case when enddate='9999-12-31' then startdate else enddate end) over (partition by memberid) else enddate end as enddate

    from #membership

    ),

    cteDayCalc

    as

    (

    select memberid,startdate,enddate,DATEDIFF(dd,startdate,enddate) as DayDiff

    from cteDateFix

    ),

    cteExplode

    as

    (

    select memberid,startdate,enddate,DayDiff,startdate+number as RangeDay from cteDaycalc

    join master.dbo.spt_values

    on spt_values.number <= DayDiff

    where TYPE='p'

    ),

    cteRanking

    as

    (

    select *,DENSE_RANK() over (partition by memberid order by RangeDay Desc) As RankDesc

    from cteExplode

    ),

    cteGrouping

    as

    (

    Select memberid,RankDesc,RangeDay,RangeDay+RankDesc as GroupingDay

    from cteRanking

    ),

    cteNewDateRanges

    as

    (

    Select memberid,MIN(RangeDay) as StartDate,MAX(RangeDay) as EndDate

    from cteGrouping

    group by memberid,GroupingDay

    )

    insert into #NewRanges(MemberId,StartDate,EndDate,RowN)

    Select memberid,StartDate,EndDate,ROW_NUMBER() over (partition by MemberId order by startDate) as Rown

    from cteNewDateRanges

    Select RangeOn.MemberId,RangeOn.StartDate,RangeOn.EndDate,DATEDIFF(dd,rangeon.EndDate,RangeNext.StartDate)-1 as Gap

    from #NewRanges RangeOn

    left join #NewRanges RangeNext

    on RangeNext.RowN = RangeOn.RowN+1

    and RangeNext.MemberId = RangeOn.MemberId

    order by 1,2



    Clear Sky SQL
    My Blog[/url]

  • The original question doesn't give expected output or the number of rows to be processed, so this is a valid alternative:

    SELECT *

    FROM #membership M

    WHERE M.enddate >= '2006-01-01'

    AND M.enddate != '9999-12-31'

    AND M.startdate < '2010-01-01'

    AND NOT EXISTS

    (

    SELECT *

    FROM #membership M2

    WHERE M2.memberid = M.memberid

    AND M2.startdate <= DATEADD(DAY, 30, M.enddate)

    AND M2.enddate > M.enddate

    );

    Paul

  • Dave Ballantyne (10/18/2010)


    I would be interested in comparative performance over a larger dataset. 😉

    No problem. Here's some test data...

    --===== Expand the test data using "lasagne" copies of the original data.

    -- You could use a recursive CTE here... if you like millions of reads. ;-)

    DECLARE @Offset INT,

    @Msg NVARCHAR(30);

    SELECT @Offset = 2;

    WHILE @@ROWCOUNT <= 1000000

    BEGIN

    SELECT @Offset = @Offset*2,

    @Msg = 'Offset = '+CAST(@Offset AS NVARCHAR(10))+'.';

    RAISERROR (@Msg,10,1) WITH NOWAIT;

    INSERT INTO #membership

    (memberid, startdate, enddate)

    SELECT memberid = memberid+@Offset, startdate, enddate

    FROM #membership;

    END;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply