Combine overlapping datetime to return single overlapping range record

  • npranj

    SSCrazy

    Points: 2323

    For e.g. -- I have these records in a table.

    Eg.

    Table: Overlapping Days

    Has following data –

    ID StartDate EndDate

    -----------------------------------------------------------------

    1 01-01-2009 10:00:00 01-01-2009 10:40:00

    2 01-01-2009 12:00:00 01-01-2009 14:00:00

    3 01-01-2009 12:30:00 01-01-2009 15:00:00

    4 01-01-2009 19:00:00 01-01-2009 19:30:00

    5 01-01-2009 20:05:00 01-01-2009 20:25:00

    6 01-01-2009 19:45:00 01-01-2009 20:20:00

    Result required:

    StartDate EndDate

    -------------------------------------------

    01-01-2009 10:00:00 01-01-2009 10:40:00

    01-01-2009 12:00:00 01-01-2009 15:00:00

    01-01-2009 19:00:00 01-01-2009 19:30:00

    01-01-2009 19:45:00 01-01-2009 20:25:00

    · We need to club the overlapping events and return rows accordingly.

    · There could be more combinations..of Startdate and enddate in the Source

    I am able to find out rows that are overlapping but I need to group these into sets and find Min(StartDate) and Max(EndDate) for each such set.

    Sample Table- DDL

    ------------------------

    CREATE TABLE [Overlapping](

    [DKey] [int] IDENTITY(1,1) NOT NULL,

    [StartDate] [datetime] NULL,

    [EndDate] [datetime] NULL

    ) ON [PRIMARY]

    GO

    SET IDENTITY_INSERT [Overlapping] ON

    INSERT [Overlapping] ([DKey], [StartDate], [EndDate]) VALUES (1, CAST(0x00009B8400B54640 AS DateTime), CAST(0x00009B8400BD83A0 AS DateTime))

    INSERT [Overlapping] ([DKey], [StartDate], [EndDate]) VALUES (2, CAST(0x00009B8400000000 AS DateTime), CAST(0x00009B8400107AC0 AS DateTime))

    INSERT [Overlapping] ([DKey], [StartDate], [EndDate]) VALUES (3, CAST(0x00009B8400A4CB80 AS DateTime), CAST(0x00009B8400B964F0 AS DateTime))

    INSERT [Overlapping] ([DKey], [StartDate], [EndDate]) VALUES (4, CAST(0x00009B84009450C0 AS DateTime), CAST(0x00009B840099CF00 AS DateTime))

    INSERT [Overlapping] ([DKey], [StartDate], [EndDate]) VALUES (5, CAST(0x00009B840099CF00 AS DateTime), CAST(0x00009B84009DEDB0 AS DateTime))

    INSERT [Overlapping] ([DKey], [StartDate], [EndDate]) VALUES (6, CAST(0x00009B84009C8E20 AS DateTime), CAST(0x00009B8400A0ACD0 AS DateTime))

    SET IDENTITY_INSERT [Overlapping] OFF

  • LutzM

    SSC Guru

    Points: 107049

    Please provide sample data as described in the first link in my signature together with what you've tried so far.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • npranj

    SSCrazy

    Points: 2323

    Hi Lutz,

    Sample table and data -

    CREATE TABLE [Overlapping](

    [DKey] [int] IDENTITY(1,1) NOT NULL,

    [StartDate] [datetime] NULL,

    [EndDate] [datetime] NULL

    ) ON [PRIMARY]

    GO

    SET IDENTITY_INSERT [Overlapping] ON

    INSERT [Overlapping] ([DKey], [StartDate], [EndDate]) VALUES (1, CAST(0x00009B8400B54640 AS DateTime), CAST(0x00009B8400BD83A0 AS DateTime))

    INSERT [Overlapping] ([DKey], [StartDate], [EndDate]) VALUES (2, CAST(0x00009B8400000000 AS DateTime), CAST(0x00009B8400107AC0 AS DateTime))

    INSERT [Overlapping] ([DKey], [StartDate], [EndDate]) VALUES (3, CAST(0x00009B8400A4CB80 AS DateTime), CAST(0x00009B8400B964F0 AS DateTime))

    INSERT [Overlapping] ([DKey], [StartDate], [EndDate]) VALUES (4, CAST(0x00009B84009450C0 AS DateTime), CAST(0x00009B840099CF00 AS DateTime))

    INSERT [Overlapping] ([DKey], [StartDate], [EndDate]) VALUES (5, CAST(0x00009B840099CF00 AS DateTime), CAST(0x00009B84009DEDB0 AS DateTime))

    INSERT [Overlapping] ([DKey], [StartDate], [EndDate]) VALUES (6, CAST(0x00009B84009C8E20 AS DateTime), CAST(0x00009B8400A0ACD0 AS DateTime))

    SET IDENTITY_INSERT [Overlapping] OFF

    What I have tried so far -

    select

    A.DKey,

    MIN(A.StartDate) StartD,

    MAX(B.EndDate) EndD

    from Overlapping A

    cross join Overlapping B

    where B.StartDate between A.StartDate and A.EndDate

    and (A.StartDate <> B.StartDate and A.EndDate <> B.EndDate )

    group by A.DKey

    order by StartD,EndD

    This is not complete and it doesnt work for a scenario wherein there are more than two overlapping events. Writing this down just to let you know the direction of my thouights on this.

  • Mark Cowne

    One Orange Chip

    Points: 26685

    Try this

    SELECT s1.StartDate,

    MIN(t1.EndDate) AS EndDate

    FROM Overlapping s1

    INNER JOIN Overlapping t1 ON s1.StartDate <= t1.EndDate

    AND NOT EXISTS(SELECT * FROM Overlapping t2

    WHERE t1.EndDate >= t2.StartDate AND t1.EndDate < t2.EndDate)

    WHERE NOT EXISTS(SELECT * FROM Overlapping s2

    WHERE s1.StartDate > s2.StartDate AND s1.StartDate <= s2.EndDate)

    GROUP BY s1.StartDate

    ORDER BY s1.StartDate

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • npranj

    SSCrazy

    Points: 2323

    Thank you very much. This works fine. 🙂

  • steven.oates

    SSC-Addicted

    Points: 405

    Morning,

    Apologies if there is a rule of somekind that states not to reopen old posts, this is my first time posting so be gentle 😉

    I found this post extremely useful afer trawling the internet for a solution to my problem

    Although I'm 90% there I still have a problem and hopefully someone can help.

    I have an additional few columns in the mix, but for example purposes let's just say 1 column which is employee.

    So effecively I have multiple employees with overlapping times, I need the times merged (as above) but for each employee.

    Here's the DDL:

    CREATE TABLE [Overlapping](

    [DKey] [int] IDENTITY(1,1) NOT NULL,

    [EmployeeId] [int] NULL,

    [LoginTime] [varchar](8) NULL,

    [LogOutTime] [varchar](8) NULL

    ) ON [PRIMARY]

    GO

    SET IDENTITY_INSERT [Overlapping] ON

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (1, 500 ,'07:47:19', '08:03:05')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (2, 500 ,'07:47:19','09:30:06')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (3, 500 ,'07:47:19','19:55:32')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (4, 500 ,'09:47:08', '11:30:13')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (5, 500 ,'11:45:56','13:14:51')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (6, 500 ,'13:47:38', '14:33:32')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (7, 500 ,'15:45:49', '19:55:27')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (8, 501 , '07:13:13','07:18:04')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (9, 501 , '07:13:13','07:13:26')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (10, 501 , '08:35:56','08:36:20')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (11, 501 , '08:35:56','09:00:00')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (12, 501 , '14:08:08','18:30:26')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (12, 501 , '15:45:25','18:30:32')

    SET IDENTITY_INSERT [Overlapping] OFF

    I've tried running:

    Select * from Overlapping

    SELECT

    s1.EmployeeId,

    s1.LogInTime,

    MIN(t1.LogOutTime) AS LogOutTime

    FROM Overlapping s1

    INNER JOIN Overlapping t1 ON s1.LogInTime <= t1.LogOutTime

    AND NOT EXISTS(SELECT * FROM Overlapping t2

    WHERE t1.LogOutTime >= t2.LogInTime AND t1.LogOutTime < t2.LogOutTime)

    WHERE NOT EXISTS(SELECT * FROM Overlapping s2

    WHERE s1.LogInTime > s2.LogInTime AND s1.LogInTime <= s2.LogOutTime)

    GROUP BY s1.EmployeeId,s1.LogInTime

    ORDER BY s1.LogInTime

    But I dont get the results I expect.

    If I only insert the records for employee 501 in the table (for example) and run it it works a treat.

    Please help 🙂

    Thanks

    Steve

  • steven.oates

    SSC-Addicted

    Points: 405

    Sorry,... just to be clear on what I would expect, if I run for 1 employee this is what I get:

    Load Table

    8 501 07:13:13 07:18:04

    9 501 07:13:13 07:13:26

    10 501 08:35:56 08:36:20

    11 501 08:35:56 09:00:00

    12 501 14:08:08 18:30:26

    12 501 15:45:25 18:30:32

    Combined Result

    501 07:13:13 07:18:04

    501 08:35:56 09:00:00

    501 14:08:08 18:30:32

    When run for more than 1 employee these are the results:

    DKey EmployeeId LoginTime LogOutTime

    1 500 07:47:19 08:03:05

    2 500 07:47:19 09:30:06

    3 500 07:47:19 19:55:32

    4 500 09:47:08 11:30:13

    5 500 11:45:56 13:14:51

    6 500 13:47:38 14:33:32

    7 500 15:45:49 19:55:27

    8 501 07:13:13 07:18:04

    9 501 07:13:13 07:13:26

    10 501 08:35:56 08:36:20

    11 501 08:35:56 09:00:00

    12 501 14:08:08 18:30:26

    12 501 15:45:25 18:30:32

    Result set

    EmployeeId LogInTime LogOutTime

    501 07:13:13 07:18:04

    500 07:47:19 19:55:32

    It doesnt group the data correctly

    Thanks

    Steve

  • charlie235

    Valued Member

    Points: 55

    Hi Steve,

    I haven't tested it on your data, but I guess you need to add an extra employeeID join condition to all your joins:

    SELECT

    s1.EmployeeId,

    s1.LogInTime,

    MIN(t1.LogOutTime) AS LogOutTime

    FROM Overlapping s1

    INNER JOIN Overlapping t1 ON s1.employeeID = t1.employeeID and s1.LogInTime <= t1.LogOutTime

    AND NOT EXISTS(SELECT * FROM Overlapping t2

    WHERE t1.employeeID = t2.employeeID and t1.LogOutTime >= t2.LogInTime AND t1.LogOutTime < t2.LogOutTime)

    WHERE NOT EXISTS(SELECT * FROM Overlapping s2

    WHERE s1.employeeID = s2.employeeID and s1.LogInTime > s2.LogInTime AND s1.LogInTime <= s2.LogOutTime)

    GROUP BY s1.EmployeeId,s1.LogInTime

    ORDER BY s1.LogInTime

    Cheers,

    Norbert

  • BradleyLeger

    Newbie

    Points: 1

    Mark Cowne wrote:

    Try this SELECT s1.StartDate, MIN(t1.EndDate) AS EndDate FROM Overlapping s1 INNER JOIN Overlapping t1 ON s1.StartDate <= t1.EndDate AND NOT EXISTS(SELECT * FROM Overlapping t2 WHERE t1.EndDate >= t2.StartDate AND t1.EndDate < t2.EndDate) WHERE NOT EXISTS(SELECT * FROM Overlapping s2 WHERE s1.StartDate > s2.StartDate AND s1.StartDate <= s2.EndDate) GROUP BY s1.StartDate ORDER BY s1.StartDate

    https://www.writemyessay24h.net/

     

    No recursion, no joins, no table updates in the loop, no “grouping by” - as a result, this solution should scale better (I think). I think that the number of scans can be reduced to two if the minimum and maximum dates are known in advance; Logic itself needs only two scans - find spaces applied twice.

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply