TSQL: Grouping continuous timeslots together

  • Dear  Communitiy,

    hope you can help me solving this issue:

    I have to group continuous timeslots together:

    Example:

    DECLARE @test-2 as Table (ID int, tFrom datetime, tUntil dateTime)

    insert into @test-2 Values (1,'2019-1-1 12:00', '2019-1-1 13:00')

    insert into @test-2 Values (1,'2019-1-1 13:00', '2019-1-1 14:00')

    insert into @test-2 Values (1,'2019-1-1 14:00', '2019-1-1 16:00')

    insert into @test-2 Values (1,'2019-1-1 18:00', '2019-1-1 19:00')

    insert into @test-2 Values (1,'2019-1-1 19:00', '2019-1-1 20:00')

    insert into @test-2 Values (1,'2019-1-1 20:00', '2019-1-1 21:00')

    insert into @test-2 Values (1,'2019-1-1 22:00', '2019-1-1 23:00')

    insert into @test-2 Values (2,'2019-1-1 12:00', '2019-1-1 13:00')

    insert into @test-2 Values (2,'2019-1-1 13:00', '2019-1-1 14:00')

    insert into @test-2 Values (2,'2019-1-1 14:00', '2019-1-1 16:00')

    insert into @test-2 Values (2,'2019-1-1 18:00', '2019-1-1 19:00')

    insert into @test-2 Values (2,'2019-1-1 19:00', '2019-1-1 20:00')

    insert into @test-2 Values (2,'2019-1-1 20:00', '2019-1-1 21:00')

    insert into @test-2 Values (2,'2019-1-1 22:00', '2019-1-1 23:00')

    Expected result:

    1; 2019-1-1 12:00; 2019-1-1 16:00

    1; 2019-1-1 18:00; 2019-1-1 21:00

    1; 2019-1-1 22:00; 2019-1-1 23:00

    2; 2019-1-1 12:00; 2019-1-1 16:00

    2; 2019-1-1 18:00; 2019-1-1 21:00

    2; 2019-1-1 22:00; 2019-1-1 23:00

    Would you so great if you can help me solving this :-)! Thanks so much in advance !

  • I think what you want is to pack intervals of data.  I found several articles that helped me with that challenge, which I needed to solve for working hours.  Here are the links:
    http://blogs.solidq.com/en/sqlserver/packing-intervals/
    http://www.sqlservercentral.com/articles/T-SQL/71550/
    I can never remember how to do this on my own, so I always go back to the samples, then change them for my specific case.  Perhaps that will work for you as well.

  • tommired2000 99625 - Friday, March 22, 2019 1:47 AM

    Dear  Communitiy,

    hope you can help me solving this issue:

    I have to group continuous timeslots together:

    Example:

    DECLARE @test-2 as Table (ID int, tFrom datetime, tUntil dateTime)

    insert into @test-2 Values (1,'2019-1-1 12:00', '2019-1-1 13:00')

    insert into @test-2 Values (1,'2019-1-1 13:00', '2019-1-1 14:00')

    insert into @test-2 Values (1,'2019-1-1 14:00', '2019-1-1 16:00')

    insert into @test-2 Values (1,'2019-1-1 18:00', '2019-1-1 19:00')

    insert into @test-2 Values (1,'2019-1-1 19:00', '2019-1-1 20:00')

    insert into @test-2 Values (1,'2019-1-1 20:00', '2019-1-1 21:00')

    insert into @test-2 Values (1,'2019-1-1 22:00', '2019-1-1 23:00')

    insert into @test-2 Values (2,'2019-1-1 12:00', '2019-1-1 13:00')

    insert into @test-2 Values (2,'2019-1-1 13:00', '2019-1-1 14:00')

    insert into @test-2 Values (2,'2019-1-1 14:00', '2019-1-1 16:00')

    insert into @test-2 Values (2,'2019-1-1 18:00', '2019-1-1 19:00')

    insert into @test-2 Values (2,'2019-1-1 19:00', '2019-1-1 20:00')

    insert into @test-2 Values (2,'2019-1-1 20:00', '2019-1-1 21:00')

    insert into @test-2 Values (2,'2019-1-1 22:00', '2019-1-1 23:00')

    Expected result:

    1; 2019-1-1 12:00; 2019-1-1 16:00

    1; 2019-1-1 18:00; 2019-1-1 21:00

    1; 2019-1-1 22:00; 2019-1-1 23:00

    2; 2019-1-1 12:00; 2019-1-1 16:00

    2; 2019-1-1 18:00; 2019-1-1 21:00

    2; 2019-1-1 22:00; 2019-1-1 23:00

    Would you so great if you can help me solving this :-)! Thanks so much in advance !

    Your table has uneven intervals so my solution that Jonathan posted a link for wouldn't work so nicely.  Itzik's solution was written well before some of the cool stuff came out in 2012 and his 3rd solution suggests a different method using Oracle... which SQL Server 2012 and up is now capable of capable of.

    I wish I could say that the following solution was my original idea but I can't.  My first glimpse at a super simple technique came from a good fellow by the name of Bert Wagner at the following link.  It's similar to Itzik's 3rd solution but a bit simpler.
    Gaps and Islands Across Date Ranges[/url]

    Using your test data (as a Temp Table rather than a Table Variable) and modifying Bert's good solution to be "top down" and to allow it have multiple partitions based on the ID, the following code will do what you want.  Comments are in the code and I strongly urge you to read Bert's good article.  Keep in mind that I used cCTEs (cascading CTEs) from the top down and Bert used nested SELECTs from the bottom up.  Other than that, the principle of the code is identical.

    I do have to post a warning though... I've not completed testing either for functionality nor performance.  I've used Bert's test data and, now, the test data from this post and they've both come out correctly but more testing is needed.

    With that in mind, here's a possible solution including your test data as a Temp Table rather than a tough to experiment with Table Variable (because it doesn't persist for partial runs of nested code).  And, apologies for the lousy indenting... the forum software deletes some leading spaces.  I hope they fix that someday. :Whistling:

    /**********************************************************************************************************************
    "Islands Across Date Ranges"

    Programmer's Notes:
    1. This code hasn't been tested for every scenario. It's only been tested with Bert's original data (see Rev 00)
      and the data from a different post (see Rev 01).
    2. This code has not yet been tested for performance.
    3. Bottom line here is "USE WITH CAUTION" until I and others can test more.

    Revision History
    Rev 00 - 12 March 2019 - Original Concept by Bert Wagner
        Ref: http://www.sqlservercentral.com/blogs/bert-wagner/2019/03/12/gaps-and-islands-across-date-ranges/
    Rev 01 - 25 March 2019 - Jeff Moden
        Modified for groups and top down reading for the following posted problem.
        https://www.sqlservercentral.com/Forums/2025012/TSQL-Grouping-continuous-timeslots-together

    **********************************************************************************************************************/
    --=====================================================================================================================
    --  Create and populate the test table. This is NOT a part of the solution.
    --=====================================================================================================================
    --===== If the test table already exists, drop it to make reruns in SSMS easier for demo purposes.
      IF OBJECT_ID('tempdb..#TestTable','U') IS NOT NULL
     DROP TABLE #TestTable
    ;
    --===== Create the test table. No indexes or keys here. We're just demonstrating a simple principle.
      -- A real table would have a PK, Clustered Index, etc.
    CREATE TABLE #TestTable
       (ID INT, tFrom DATETIME, tUntil DATETIME)
    ;
    --===== Populate the test table.
    INSERT INTO #TestTable
       (ID, tFrom, tUntil)
    VALUES (1,'2019-01-01 12:00','2019-01-01 13:00')
       ,(1,'2019-01-01 13:00','2019-01-01 14:00')
       ,(1,'2019-01-01 14:00','2019-01-01 16:00')
       ,(1,'2019-01-01 18:00','2019-01-01 19:00')
       ,(1,'2019-01-01 19:00','2019-01-01 20:00')
       ,(1,'2019-01-01 20:00','2019-01-01 21:00')
       ,(1,'2019-01-01 22:00','2019-01-01 23:00')
       ,(2,'2019-01-01 12:00','2019-01-01 13:00')
       ,(2,'2019-01-01 13:00','2019-01-01 14:00')
       ,(2,'2019-01-01 14:00','2019-01-01 16:00')
       ,(2,'2019-01-01 18:00','2019-01-01 19:00')
       ,(2,'2019-01-01 19:00','2019-01-01 20:00')
       ,(2,'2019-01-01 20:00','2019-01-01 21:00')
       ,(2,'2019-01-01 22:00','2019-01-01 23:00')
    ;
    --=====================================================================================================================
    --  Solve the problem from the "top down" using cCTEs (Cascading CTEs)
    --=====================================================================================================================
     WITH
    cteGroups AS
    (--==== This numbers the rows according to start and end dates partitioned by the ID.
      -- It also includes the previous end date using LAG.
    SELECT ID, tFrom, tUntil
       ,RN    = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY tFrom, tUntil)
       ,Prev_tUntil = LAG(tUntil,1) OVER (PARTITION BY ID ORDER BY tFrom, tUntil)
     FROM #TestTable
    )
    ,cteIslands AS
    (--==== If the previous end date is >= than the current start date, then we're in the same group of dates.
      -- The formula simply counts when that doesn't happen (not in same group), which means we started a new group.
      -- This only works because we already sorted the RN by ID and dates.
    SELECT *
      -- ,IslandStartInd =  CASE WHEN Prev_tUntil >= tFrom THEN 0 ELSE 1 END --Just for "show", can uncomment to see
       ,IslandId   = SUM(CASE WHEN Prev_tUntil >= tFrom THEN 0 ELSE 1 END) OVER (PARTITION BY ID ORDER BY RN)
     FROM cteGroups
    )--==== Now, just get the min start date and max end date for each island within each ID and we're done.
    SELECT ID
       ,IslandFrom  = MIN(tFrom)
       ,IslandtUntil = MAX(tUntil)
     FROM cteIslands
    GROUP BY ID, IslandId
    ORDER BY ID, IslandFrom
    ;


    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This is based on an update that Itzik made New Solution to the Packing Intervals Problem

    WITH interval_starts AS
    (
        SELECT *, CASE WHEN tFrom > LAG(tUntil, 1, '1900') OVER(PARTITION BY ID ORDER BY tFrom, tUntil) THEN 1 ELSE 0 END AS is_start
        FROM #TestTable tt
    )
    , interval_groups AS
    (
        SELECT *, SUM(is_start) OVER(PARTITION BY ID ORDER BY tFrom, tUntil ROWS UNBOUNDED PRECEDING) AS grp
        FROM interval_starts i
    )
    SELECT ID, MIN(tFrom) AS tFrom, MAX(tUntil) AS tUntil
    FROM interval_groups g
    GROUP BY ID, grp
    ORDER BY ID, tFrom

    It appears that it runs ever so slightly faster, and I think that's because it has one fewer sort.  (I skipped the ROW_NUMBER(), so I don't have to sort to get the row number and then sort on the row number.)

    I don't have time to set up an extensive test bed.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen - Tuesday, March 26, 2019 2:49 PM

    This is based on an update that Itzik made New Solution to the Packing Intervals Problem

    WITH interval_starts AS
    (
        SELECT *, CASE WHEN tFrom > LAG(tUntil, 1, '1900') OVER(PARTITION BY ID ORDER BY tFrom, tUntil) THEN 1 ELSE 0 END AS is_start
        FROM #TestTable tt
    )
    , interval_groups AS
    (
        SELECT *, SUM(is_start) OVER(PARTITION BY ID ORDER BY tFrom, tUntil ROWS UNBOUNDED PRECEDING) AS grp
        FROM interval_starts i
    )
    SELECT ID, MIN(tFrom) AS tFrom, MAX(tUntil) AS tUntil
    FROM interval_groups g
    GROUP BY ID, grp
    ORDER BY ID, tFrom

    It appears that it runs ever so slightly faster, and I think that's because it has one fewer sort.  (I skipped the ROW_NUMBER(), so I don't have to sort to get the row number and then sort on the row number.)

    I don't have time to set up an extensive test bed.

    Drew

    Nicely done, Drew!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply