Help Required to handle multiple employees with overlapping datetimes (concurrent activities)

  • Morning,

    I have a data set that contains employee shift data, when they logged in and logged out of a certain media type.

    The problem is that the employee could be logged into multiple media types at any given time.

    I need to report the employees total login duration for the day but obviously I cant just sum all the login time due to the concurrent activities it may be 12 hours when in actually real time it was only 7 hours.

    For example

    DKeyEmployeeIdLoginTimeLogOutTime

    850107:13:1307:18:04

    950107:13:1307:13:26

    1050108:35:5608:36:20

    1150108:35:5609:00:00

    1250114:08:0818:30:26

    1250115:45:2518:30:32

    In realtime would actually be this

    EmployeeIdLogInTimeLogOutTime

    50107:13:1307:18:04

    50108:35:5609:00:00

    50114:08:0818:30:32

    I've done a bit of digging on the internet forums etc and cant seem to crack it.

    I've added the DDl below which will help explain it a bit better, the first 2 result sets show it works if the source table only has 1 employee in it but when there is more than 1 it doesn't return the correct results.

    Please help

    Regards

    Steve

    CREATE TABLE [Overlapping](

    [DKey] [int] IDENTITY(1,1) NOT NULL,

    [EmployeeId] [int] NULL,

    [LoginTime] [varchar](8) NULL,

    [LogOutTime] [varchar](8) NULL

    ) ON [PRIMARY]

    GO

    SET IDENTITY_INSERT [Overlapping] ON

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (8, 501 , '07:13:13','07:18:04')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (9, 501 , '07:13:13','07:13:26')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (10, 501 , '08:35:56','08:36:20')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (11, 501 , '08:35:56','09:00:00')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (12, 501 , '14:08:08','18:30:26')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (12, 501 , '15:45:25','18:30:32')

    SET IDENTITY_INSERT [Overlapping] OFF

    Select * from Overlapping

    SELECT

    s1.EmployeeId,

    s1.LogInTime,

    MIN(t1.LogOutTime) AS LogOutTime

    FROM Overlapping s1

    INNER JOIN Overlapping t1 ON s1.LogInTime <= t1.LogOutTime

    AND NOT EXISTS(SELECT * FROM Overlapping t2

    WHERE t1.LogOutTime >= t2.LogInTime AND t1.LogOutTime < t2.LogOutTime)

    WHERE NOT EXISTS(SELECT * FROM Overlapping s2

    WHERE s1.LogInTime > s2.LogInTime AND s1.LogInTime <= s2.LogOutTime)

    GROUP BY s1.EmployeeId,s1.LogInTime

    ORDER BY s1.LogInTime

    Drop table overlapping

    CREATE TABLE [Overlapping](

    [DKey] [int] IDENTITY(1,1) NOT NULL,

    [EmployeeId] [int] NULL,

    [LoginTime] [varchar](8) NULL,

    [LogOutTime] [varchar](8) NULL

    ) ON [PRIMARY]

    GO

    SET IDENTITY_INSERT [Overlapping] ON

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (1, 500 ,'07:47:19', '08:03:05')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (2, 500 ,'07:47:19','09:30:06')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (3, 500 ,'07:47:19','19:55:32')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (4, 500 ,'09:47:08', '11:30:13')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (5, 500 ,'11:45:56','13:14:51')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (6, 500 ,'13:47:38', '14:33:32')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (7, 500 ,'15:45:49', '19:55:27')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (8, 501 , '07:13:13','07:18:04')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (9, 501 , '07:13:13','07:13:26')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (10, 501 , '08:35:56','08:36:20')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (11, 501 , '08:35:56','09:00:00')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (12, 501 , '14:08:08','18:30:26')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (12, 501 , '15:45:25','18:30:32')

    SET IDENTITY_INSERT [Overlapping] OFF

    Select * from Overlapping

    SELECT

    s1.EmployeeId,

    s1.LogInTime,

    MIN(t1.LogOutTime) AS LogOutTime

    FROM

    Overlapping s1

    INNER JOIN Overlapping t1 ON s1.LogInTime <= t1.LogOutTime

    And s1.EmployeeId = t1.EmployeeId

    AND NOT EXISTS(SELECT * FROM Overlapping t2

    WHERE t1.LogOutTime >= t2.LogInTime AND t1.LogOutTime < t2.LogOutTime

    Group By EmployeeId)

    WHERE NOT EXISTS(SELECT * FROM Overlapping s2

    WHERE s1.LogInTime > s2.LogInTime AND s1.LogInTime <= s2.LogOutTime

    Group By EmployeeId)

    GROUP BY s1.EmployeeId,s1.LogInTime

    ORDER BY s1.LogInTime

    Drop table overlapping

  • Are you familiar with this article by Jeff Moden on grouping islands of contiguous dates?

    http://www.sqlservercentral.com/articles/T-SQL/71550/

    I believe you can probably use this technique to solve your problem. I confess, I found it somewhat complicated to grasp at first and if I have some available time I may try to apply it directly to your problem.

    But I suggest you read it anyway so that even if someone does post a solution that uses it (likely), you understand how it works.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Sorry there my friend. I gave you the wrong link. Fortunately I saved a script with this incredibly elegant technique by SQL guru/MVP Itzik Ben-Gan.

    DECLARE @T TABLE

    (DKey INT, EmployeeId INT, LoginTime TIME, LogOutTime TIME)

    INSERT INTO @T

    SELECT 8,501,'07:13:13','07:18:04'

    UNION ALL SELECT 9,501,'07:13:13','07:13:26'

    UNION ALL SELECT 10,501,'08:35:56','08:36:20'

    UNION ALL SELECT 11,501,'08:35:56','09:00:00'

    UNION ALL SELECT 12,501,'14:08:08','18:30:26'

    UNION ALL SELECT 12,501,'15:45:25','18:30:32'

    SELECT *

    FROM @T

    -- Method by Itzik Ben-Gan

    -- http://www.solidq.com/sqj/Pages/2011-March-Issue/Packing-Intervals.aspx

    ;WITH C1 AS (

    SELECT EmployeeId, ts, Type

    ,e=CASE Type WHEN 1 THEN NULL ELSE ROW_NUMBER() OVER (PARTITION BY EmployeeId, Type ORDER BY LogoutTime) END

    ,s=CASE Type WHEN -1 THEN NULL ELSE ROW_NUMBER() OVER (PARTITION BY EmployeeId, Type ORDER BY LoginTime) END

    FROM @T

    CROSS APPLY (

    VALUES (1, LoginTime), (-1, LogoutTime)) a(Type, ts)

    ),

    C2 AS (

    SELECT C1.*

    ,se=ROW_NUMBER() OVER (PARTITION BY EmployeeId ORDER BY ts, Type DESC)

    FROM C1),

    C3 AS (

    SELECT EmployeeId, ts

    ,grpnm=FLOOR((ROW_NUMBER() OVER (PARTITION BY EmployeeId ORDER BY ts)-1) / 2 + 1)

    FROM C2

    WHERE COALESCE(s-(se-s)-1, (se-e)-e) = 0)

    SELECT EmployeeId, StartDate=MIN(ts), EndDate=MAX(ts)

    FROM C3

    GROUP BY EmployeeId, grpnm

    The link to his article is in the code.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Wow thanks Dwain (and Itzik Ben-Gan) worked an absoulate treat 🙂

    Just to demontrate that it worked with multiple employees I've slightly tweaked it but all in all fantastic, I shall save that forever.

    Thanks

    Steve

    CREATE TABLE [Overlapping](

    [DKey] [int] IDENTITY(1,1) NOT NULL,

    [EmployeeId] [int] NULL,

    [LoginTime] [varchar](8) NULL,

    [LogOutTime] [varchar](8) NULL

    ) ON [PRIMARY]

    GO

    SET IDENTITY_INSERT [Overlapping] ON

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (1, 500 ,'07:47:19', '08:03:05')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (2, 500 ,'07:47:19','09:30:06')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (3, 500 ,'07:47:19','19:55:32')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (4, 500 ,'09:47:08', '11:30:13')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (5, 500 ,'11:45:56','13:14:51')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (6, 500 ,'13:47:38', '14:33:32')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (7, 500 ,'15:45:49', '19:55:27')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (8, 501 , '07:13:13','07:18:04')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (9, 501 , '07:13:13','07:13:26')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (10, 501 , '08:35:56','08:36:20')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (11, 501 , '08:35:56','09:00:00')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (12, 501 , '14:08:08','18:30:26')

    INSERT [Overlapping] ([DKey], [EmployeeId], [LoginTime], [LogOutTime]) VALUES (12, 501 , '15:45:25','18:30:32')

    SET IDENTITY_INSERT [Overlapping] OFF

    Select * from overlapping

    -- Method by Itzik Ben-Gan

    -- http://www.solidq.com/sqj/Pages/2011-March-Issue/Packing-Intervals.aspx

    ;WITH C1 AS (

    SELECT EmployeeId, ts, Type

    ,e=CASE Type WHEN 1 THEN NULL ELSE ROW_NUMBER() OVER (PARTITION BY EmployeeId, Type ORDER BY LogoutTime) END

    ,s=CASE Type WHEN -1 THEN NULL ELSE ROW_NUMBER() OVER (PARTITION BY EmployeeId, Type ORDER BY LoginTime) END

    FROM overlapping

    CROSS APPLY (

    VALUES (1, LoginTime), (-1, LogoutTime)) a(Type, ts)

    ),

    C2 AS (

    SELECT C1.*

    ,se=ROW_NUMBER() OVER (PARTITION BY EmployeeId ORDER BY ts, Type DESC)

    FROM C1),

    C3 AS (

    SELECT EmployeeId, ts

    ,grpnm=FLOOR((ROW_NUMBER() OVER (PARTITION BY EmployeeId ORDER BY ts)-1) / 2 + 1)

    FROM C2

    WHERE COALESCE(s-(se-s)-1, (se-e)-e) = 0)

    SELECT EmployeeId, StartDate=MIN(ts), EndDate=MAX(ts)

    FROM C3

    GROUP BY EmployeeId, grpnm

    Drop table Overlapping

  • I must admit to only having had a quick glance at this, but couldn't you do something like this instead?

    SELECT EmployeeId, MIN(LoginTime) AS LoginTime, MAX(LogOutTime) AS LogOutTime

    FROM (SELECT DENSE_RANK() OVER(PARTITION BY base.EmployeeId ORDER BY base.LoginTime),

    base.EmployeeId, base.LoginTime, overLap.LogOutTime

    FROM Overlapping base

    INNER JOIN Overlapping overLap ON base.EmployeeId = overLap.EmployeeId

    WHERE (overLap.LoginTime > Base.LoginTime AND overLap.LoginTime < base.LogOutTime)

    OR (overLap.LoginTime = base.LoginTime AND overLap.LogOutTime > base.LogOutTime)

    AND base.DKey <> overLap.DKey

    )a(pos,EmployeeId,LoginTime,LogOutTime)

    GROUP BY pos, EmployeeId;

    Don't know whether or not it'd be any faster. . . could probably do with a performance check but I haven't really got the time to knock up a million row sample test to check.


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Steven - You're welcome and sorry again for sending you the initial red herring, although Jeff's article is also one to read and save.

    Cadavre - I tested the Ben-Gan approach against several others (although not one that looked like yours) and it was lightening fast.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (12/14/2012)


    Cadavre - I tested the Ben-Gan approach against several others (although not one that looked like yours) and it was lightening fast.

    Don't suppose you still have the test scripts? Just curious what other implementations there are 🙂


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Cadavre (12/14/2012)


    dwain.c (12/14/2012)


    Cadavre - I tested the Ben-Gan approach against several others (although not one that looked like yours) and it was lightening fast.

    Don't suppose you still have the test scripts? Just curious what other implementations there are 🙂

    PM'd you with an answer.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • WOW!

    Ok so a little more background, the DDL was just a made up example that I thought if cracked would provide the solution to my real dataset.

    Thought being the definative word.

    The solution above does indeed work for multiple employees but only on a single date.

    I began a backdate on Friday and when I looked at the results it all went Pete Tong.

    So i've read over your explanation a good 5-6 times and still cant get my head round it (tbh I fell at the first hurdle when you mentioned HULU (wtf is hulu lol) - then I realised as I'm british it's acceptable to not know what hulu is 😉

    Anyway to the point, I've implemented your solution this morning and... Bang, the dirt is gone 🙂

    Worked a treat.

    I had to strip out the media type as I can report on them seperately, my solution was just to get the logged hours for the agent for the day.

    Thanks Celko 🙂

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply