Is this a "gaps and islands" problem? Finding gaps in overlapping times.

  • The following can be pasted into SSMS so that it is both more readable, and easier get your teeth into the data.

    /* System: SQL 2005

    Problem: I have a table of data representing when people stayed in certain locations.

    Its fine for multiple people to stay in the one location at the same time. People can arrive

    and leave independently of one another. I need to find the "gaps" in the data where there was

    nobody at a given location. A person can't be in two locations at once. Here's

    some sample data, but the real data will be over a million rows long and the query will run

    many times a day.

    */

    --======= Create a test table

    IF OBJECT_ID('tempdb..#stays') IS NOT NULL

    DROP TABLE #stays

    ;

    CREATE TABLE #stays(transaction_id int identity(1,1)

    ,location_id int not null

    ,person_id int not null

    ,start_dt datetime not null

    ,end_dt datetime null

    CONSTRAINT [PK_#stay] PRIMARY KEY NONCLUSTERED (transaction_id ASC)

    )

    ;

    GO

    --======= Add a unique constraint

    CREATE UNIQUE CLUSTERED INDEX ix_person_id_start_dt ON #stays (person_id,start_dt);

    --======= Add some sample data

    INSERT #stays(location_id,person_id,start_dt,end_dt) VALUES (1,1,'20130701 12:38','20130703 14:19');

    INSERT #stays(location_id,person_id,start_dt,end_dt) VALUES (1,2,'20130702 01:32','20130704 05:20');

    INSERT #stays(location_id,person_id,start_dt,end_dt) VALUES (1,7,'20130703 04:32','20130705 08:23');

    INSERT #stays(location_id,person_id,start_dt,end_dt) VALUES (1,3,'20130707 14:35','20130710 09:39');

    INSERT #stays(location_id,person_id,start_dt,end_dt) VALUES (1,4,'20130708 11:30','20130709 19:13');

    INSERT #stays(location_id,person_id,start_dt,end_dt) VALUES (2,5,'20130701 16:31','20130702 14:31');

    INSERT #stays(location_id,person_id,start_dt,end_dt) VALUES (2,6,'20130703 13:37','20130705 22:27');

    INSERT #stays(location_id,person_id,start_dt,end_dt) VALUES (2,1,'20130708 19:32','20130711 04:56');

    INSERT #stays(location_id,person_id,start_dt,end_dt) VALUES (2,2,'20130715 18:38','20130718 14:44');

    INSERT #stays(location_id,person_id,start_dt,end_dt) VALUES (3,3,'20130712 01:32','20130712 21:19');

    GO

    /* The activity could be displayed graphically thus:

    Date |-1--|-2--|-3--|-4-|-5--|-6--|-7--|-8--|-9--|-10-|-11-|-12-|-13-|-14-|-15-|-16-|-17-|-18-|

    Loc 1 Psn 1 ***|---------| ********** ******************************************

    Loc 1 Psn 2 *** |----------| ********** ******************************************

    Loc 1 Psn 7 *** |---------|********** ******************************************

    Loc 1 Psn 3 *** **********|--------------|******************************************

    Loc 1 Psn 4 *** ********** |-----| ******************************************

    ______________________________________________________________________________________________________

    Loc 2 Psn 5 **|-----|*** *************** ********************* **

    Loc 2 Psn 6 ** ***|---------|*************** ********************* **

    Loc 2 Psn 1 ** *** |-----------|********************* **

    Loc 2 Psn 2 ** *** *************** *********************|--------------|**

    ______________________________________________________________________________________________________

    Loc 3 Psn 3 *******************************************************|--|*******************************

    */

    --======= The asterisks above represent the gaps at each location (times when it was unoccupied).

    --======= I want a final dataset as follows:

    /*

    location_idunoccupied_start_dtunoccupied_end_dttransaction_type

    1 NULL 2013-07-01 12:38 unoccupied

    1 2013-07-05 08:232013-07-07 14:35unoccupied

    1 2013-07-10 09:39NULL unoccupied

    2 NULL 2013-07-01 16:31 unoccupied

    2 2013-07-02 14:312013-07-03 13:37unoccupied

    2 2013-07-05 22:272013-07-08 19:32unoccupied

    2 2013-07-11 04:562013-07-15 18:38unoccupied

    2 2013-07-18 14:44NULL unoccupied

    3 NULL 2013-07-12 01:32 unoccupied

    3 2013-07-12 21:19NULL unoccupied

    */

    --======= This is no good when you get overlapping people at one location.

    WITH ordered_starts as

    (

    SELECT transaction_id

    ,stay.location_id

    ,stay.person_id

    ,stay.start_dt

    ,stay.end_dt

    ,ROW_NUMBER() OVER(PARTITION BY stay.location_id

    ORDER BY stay.start_dt) as seq

    FROM #stays stay

    )

    SELECT prev.location_id

    ,prev.end_dt as unoccupied_start_dt

    ,nxt.start_dt as unoccupied_end_dt

    ,'unoccupied' as transaction_type

    FROM ordered_starts prev

    LEFT JOIN ordered_starts nxt ON prev.location_id = nxt.location_id

    and prev.seq = nxt.seq - 1

    ORDER BY prev.location_id

    ,prev.start_dt

    ;

    So the obvious question is "How do I identify the gaps where the locations are unoccupied?

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

  • There's a little date arithmetic left for you in this one:

    SELECT

    location_id,

    unoccupied_start_dt = CASE WHEN seq = 1 THEN NULL ELSE unoccupied_start_dt END,

    unoccupied_end_dt = CASE WHEN seq = [Rows] THEN NULL ELSE unoccupied_end_dt END

    FROM (

    SELECT

    location_id,

    unoccupied_start_dt = MIN(Timespot),

    unoccupied_end_dt = MAX(Timespot),

    seq = ROW_NUMBER() OVER(PARTITION BY location_id ORDER BY TimeGroup),

    [Rows] = COUNT(*) OVER(PARTITION BY location_id)

    FROM (

    SELECT

    s.location_id, s.MIN_start_dt, s.MAX_end_dt,

    x.Timespot,

    TimeGroup = DATEADD(minute,1-ROW_NUMBER() OVER(PARTITION BY s.location_id ORDER BY x.Timespot), x.Timespot)

    FROM (SELECT location_id, MIN_start_dt = MIN(start_dt), MAX_end_dt = MAX(end_dt) FROM #stays GROUP BY location_id) s

    CROSS APPLY (

    SELECT TOP (DATEDIFF(minute,s.MIN_start_dt,s.MAX_end_dt)+3)

    TimeSpot = DATEADD(minute,-2 + ROW_NUMBER() OVER(ORDER BY (SELECT NULL)),s.MIN_start_dt)

    FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n)

    ) x

    WHERE NOT EXISTS (SELECT 1 FROM #stays l WHERE l.location_id = s.location_id

    AND x.Timespot BETWEEN l.start_dt AND l.end_dt)

    ) d

    GROUP BY location_id, TimeGroup

    ) o

    I'm willing to bet DwainC has a better solution than this, he's just had a Gaps'n'Islands paper published over on Simple Talk.

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks Chris. I'll go away and digest it. The VALUES... syntax doesn't work in SQL 2005, but it looks suspiciously like a tally table (which I already have) so I might be able to work around it.

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

  • OK. So I'm trying to understand table value constructors and cross apply. If I do this (on a SQL 2008 machine):

    SELECT *

    FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n)

    ...I get 100,000 rows and 5 columns. All the columns are called n and all the values are zero. What does this do? Why do I need 5 identical columns?

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

  • ChrisM@Work (8/14/2013)


    I'm willing to bet DwainC has a better solution than this, he's just had a Gaps'n'Islands paper published over on Simple Talk.

    You just had to call me out on this one didn't you?

    I must rise to the challenge.

    -- Islands method by Itzik Ben-Gan

    -- http://www.solidq.com/sqj/Pages/2011-March-Issue/Packing-Intervals.aspx

    WITH C1 AS (

    SELECT location_id, ts, Type

    ,e=CASE Type WHEN 1 THEN NULL ELSE ROW_NUMBER() OVER (PARTITION BY location_id, Type ORDER BY end_dt) END

    ,s=CASE Type WHEN -1 THEN NULL ELSE ROW_NUMBER() OVER (PARTITION BY location_id, Type ORDER BY start_dt) END

    FROM -- #stays

    --CROSS APPLY (

    -- VALUES (1, start_dt), (-1, end_dt)) a(Type, ts)

    (

    SELECT location_id, [Type]=1, ts=start_dt, start_dt, end_dt

    FROM #stays

    UNION ALL

    SELECT location_id, -1, end_dt, start_dt, end_dt

    FROM #stays

    ) a

    ),

    C2 AS (

    SELECT C1.*

    ,se=ROW_NUMBER() OVER (PARTITION BY location_id ORDER BY ts, Type DESC)

    FROM C1),

    C3 AS (

    SELECT location_id, ts

    ,grpnm=FLOOR((ROW_NUMBER() OVER (PARTITION BY location_id ORDER BY ts)-1) / 2 + 1)

    FROM C2

    WHERE COALESCE(s-(se-s)-1, (se-e)-e) = 0),

    C4 AS (

    SELECT location_id, StartDate=MIN(ts), EndDate=MAX(ts)

    FROM C3

    GROUP BY location_id, grpnm)

    -- Convert the Islands from C4 into gaps (method by Dwain.C)

    -- https://www.simple-talk.com/sql/t-sql-programming/the-sql-of-gaps-and-islands-in-sequences/

    SELECT location_id, StartDate=MIN([date]), EndDate=MAX([date])

    FROM (

    SELECT location_id, [date]

    ,rn=ROW_NUMBER() OVER (PARTITION BY location_id ORDER BY [date])/2

    FROM (

    -- Normally this would be a CROSS APPLY VALUES but this works in SQL 2005

    SELECT location_id, [date]=StartDate

    FROM C4

    UNION ALL

    SELECT location_id, [date]=EndDate

    FROM C4

    ) a

    ) a

    GROUP BY location_id, rn

    HAVING COUNT(*) = 2

    ;

    Edit: Ooops! Forgot to change Mr. Ben-Gan's CROSS APPLY VALUES to something compatible with SQL 2005.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • GPO (8/14/2013)


    OK. So I'm trying to understand table value constructors and cross apply. If I do this (on a SQL 2008 machine):

    SELECT *

    FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n)

    ...I get 100,000 rows and 5 columns. All the columns are called n and all the values are zero. What does this do? Why do I need 5 identical columns?

    Try it like this:

    SELECT n=ROW_NUMBER() OVER (ORDER BY (SELECT $))

    FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n)

    And you find it generates a 100,000 row tally table.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • GPO (8/14/2013)


    OK. So I'm trying to understand table value constructors and cross apply. If I do this (on a SQL 2008 machine):

    SELECT *

    FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n)

    ...I get 100,000 rows and 5 columns. All the columns are called n and all the values are zero. What does this do? Why do I need 5 identical columns?

    Going just a little deeper...

    You're not actually supposed to use any data from that. You're only supposed to use the presence of rows instead of writing a loop (and it will blow the doors off of any explicit loop or recursive CTE). R. Barry Young coined a phase for such a thing as "Pseudo Cursor".

    If you modify the SELECT list just a bit, you can see that it's a readless Tally Table. Adding TOP(some#) (which I didn't do below) to the SELECT will allow you to control the limit of the 1 to N domain.

    SELECT N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) b (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d (n)

    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) e (n)

    Try the same thing with a While Loop or Recursive CTE and compare the resource usage and duration to the above.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff doesn't like it when I use

    SELECT $

    In my tally tables. :hehe: Just having a little fun there.

    BTW. I noticed you also wanted the open-ended gaps before start and after end. From your comments in the code you posted:

    --======= I want a final dataset as follows:

    /*

    location_idunoccupied_start_dtunoccupied_end_dttransaction_type

    1 NULL 2013-07-01 12:38 unoccupied

    1 2013-07-05 08:232013-07-07 14:35unoccupied

    1 2013-07-10 09:39NULL unoccupied

    2 NULL 2013-07-01 16:31 unoccupied

    2 2013-07-02 14:312013-07-03 13:37unoccupied

    2 2013-07-05 22:272013-07-08 19:32unoccupied

    2 2013-07-11 04:562013-07-15 18:38unoccupied

    2 2013-07-18 14:44NULL unoccupied

    3 NULL 2013-07-12 01:32 unoccupied

    3 2013-07-12 21:19NULL unoccupied

    Those are easy enough to add by tacking the following onto the end of the code I posted (after the HAVING clause).

    UNION ALL

    SELECT location_id, NULL, MIN(start_dt)

    FROM #stays

    GROUP BY location_id

    UNION ALL

    SELECT location_id, MAX(end_dt), NULL

    FROM #stays

    GROUP BY location_id

    ORDER BY location_id, StartDate, EndDate

    I got so focused on the gaps thingy that I missed that little wrinkle on first scan.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Awesome to have the heavy hitters on the case! Jeff, where you say:

    You're not actually supposed to use any data from that. You're only supposed to use the presence of rows instead of writing a loop

    I understand that, but if you run that it returns 5 columns of zeros. All called [n]. And I can't work out why that's necessary. Surely one column of zeros would be enough.

    Edit: I think I get it now! The cross joins mean that 10^5 rows are generated, and you're saying it's irrelevant that 5 meaningless columns just happen to be generated. Doing an extra CROSS JOIN would presumably result in 6 meaningless columns and a million rows. Does that sound right?

    Edit 2:

    a readless Tally Table

    So that means a tally table that doesn't have to be read from the disk, and is presumably therefore faster than the permanent tally table I usually use. How am I going so far?

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

  • GPO (8/15/2013)


    Awesome to have the heavy hitters on the case! Jeff, where you say:

    You're not actually supposed to use any data from that. You're only supposed to use the presence of rows instead of writing a loop

    I understand that, but if you run that it returns 5 columns of zeros. All called [n]. And I can't work out why that's necessary. Surely one column of zeros would be enough.

    Edit: I think I get it now! The cross joins mean that 10^5 rows are generated, and you're saying it's irrelevant that 5 meaningless columns just happen to be generated. Doing an extra CROSS JOIN would presumably result in 6 meaningless columns and a million rows. Does that sound right?

    I'll try to answer.

    The CROSS JOINs are building a Cartesian product of rows 10 at a time. 10x10x10x10x10

    Each CROSS JOIN adds an additional column to the result set because the table returned by the table row constructor must have a named column in it. They could be named anything - Chris just chose to name each one n.

    However, since all of the values in each of the 5 columns is zero, that isn't what you need. What you need is the ROW_NUMBER, which is your Tally table.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • GPO (8/15/2013)


    Edit: I think I get it now! The cross joins mean that 10^5 rows are generated, and you're saying it's irrelevant that 5 meaningless columns just happen to be generated. Doing an extra CROSS JOIN would presumably result in 6 meaningless columns and a million rows. Does that sound right?

    Edit 2:

    a readless Tally Table

    So that means a tally table that doesn't have to be read from the disk, and is presumably therefore faster than the permanent tally table I usually use. How am I going so far?

    Seems that your edit was on the mark.

    Here's a link you can look at that is comparing in-line vs. disk tally tables. The result I'd say is that it depends.

    http://www.sqlservercentral.com/scripts/tally/100338/

    I believe the comparison code is in an attached resource file.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Hi Dwain

    I'm enormously grateful for the code you've put up. I'll test yours and Chris's and see what I can learn from them. I'll post back my observations after some time for reflection...(he said clinging for dear life to the learning curve)

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

  • GPO (8/15/2013)


    Hi Dwain

    I'm enormously grateful for the code you've put up. I'll test yours and Chris's and see what I can learn from them. I'll post back my observations after some time for reflection...(he said clinging for dear life to the learning curve)

    I'd be interested to know how the 2 compare on performance. I'm not so sure mine will stand up, primarily because of the adaptations I had to make to be SQL 2005 compatible.

    Hopefully we'll see.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • GPO (8/15/2013)


    Hi Dwain

    I'm enormously grateful for the code you've put up. I'll test yours and Chris's and see what I can learn from them. I'll post back my observations after some time for reflection...(he said clinging for dear life to the learning curve)

    Dwain's quite capable of providing an American English description of how his code works. Here's an English description of mine ๐Ÿ˜€

    For each LocationID, find the earliest and latest date in the set. Subtract one interval from the earliest and add one interval to the latest. An interval for this exercise is defined as one minute.

    Generate a row for each interval between these two dates - a set of dates incrementing by one minute from the start date (minus a minute) to the end date (plus a minute).

    Remove rows from the list which are between the start date and end date of any visit for the locationID. This will leave a date range with gaps in it, where the gaps correspond to visits.

    Divine the start and end date of each contiguous date range remaining.

    Finally, process the start and end date to generate the NULLs shown in your example.

    Nice easy query to finish off a busy day with ๐Ÿ˜‰

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris how would you change yours to work on SQL 2005 (no cross apply and table value constructors)?

    ...One of the symptoms of an approaching nervous breakdown is the belief that ones work is terribly important.... Bertrand Russell

Viewing 15 posts - 1 through 15 (of 24 total)

You must be logged in to reply to this topic. Login to reply