Calculate Working Hour Query

  • Hi All,

    I am maintaining a login and logout in my web application.

    My tables entries as below.

    SNO EMPIDSTATUSDATE

    113645320417LOGIN7/2/10 3:21 PM

    113648320417PAUSE7/2/10 3:25 PM

    113649320417RESUME7/2/10 3:25 PM

    113650320417OUT 7/2/10 3:54 PM

    113651320418LOGIN7/2/10 4:21 PM

    113652320418PAUSE7/2/10 4:25 PM

    113653320418RESUME7/2/10 4:25 PM

    113654320418OUT 7/2/10 4:54 PM

    Can any one suggest me how can i calculate total working hour for each employee?

    Thanks!

    Shatrugnhna

    Shatrughna

  • Next time, please present your question in more helpfull way (you can find the link to how to do so in the bottom of my signature)!

    I have added some more test data, which contain records for the multiple days and situation where user didn't paused.

    declare @timeLog table (SNO int, EMPID int, [STATUS] char(10), [DATE] Datetime)

    insert @timeLog

    select 113645, 320417, 'LOGIN', '7/2/10 3:21 PM'

    union select 113648, 320417, 'PAUSE', '7/2/10 3:23 PM'

    union select 113649, 320417, 'RESUME', '7/2/10 3:24 PM'

    union select 113650, 320417, 'OUT', '7/2/10 3:54 PM'

    union select 113651, 320418, 'LOGIN', '7/2/10 4:21 PM'

    union select 113652, 320418, 'PAUSE', '7/2/10 4:31 PM'

    union select 113653, 320418, 'RESUME', '7/2/10 4:33 PM'

    union select 113654, 320418, 'OUT', '7/2/10 4:54 PM'

    union select 113655, 320418, 'LOGIN', '7/3/10 4:21 PM'

    union select 113656, 320418, 'PAUSE', '7/3/10 4:26 PM'

    union select 113657, 320418, 'RESUME', '7/3/10 4:28 PM'

    union select 113658, 320418, 'OUT', '7/3/10 4:54 PM'

    union select 113659, 320419, 'LOGIN', '7/3/10 4:21 PM'

    union select 113662, 320419, 'OUT', '7/3/10 4:54 PM'

    ;with minutePeriods

    as

    (

    select EMPID, [STATUS], [DATE]

    ,CASE WHEN [STATUS] = 'LOGIN' or [STATUS] = 'OUT' THEN 1 ELSE 2 END AS periodGroup

    ,ROW_NUMBER() OVER(PARTITION BY EMPID

    ORDER BY CASE WHEN [STATUS] = 'LOGIN' or [STATUS] = 'OUT' THEN 1 ELSE 2 END

    ,[DATE] DESC) AS PosNo

    ,ROW_NUMBER() OVER(PARTITION BY EMPID

    ORDER BY CASE WHEN [STATUS] = 'LOGIN' or [STATUS] = 'OUT' THEN 1 ELSE 2 END

    ,[DATE] DESC) + 1 AS NxtPosNo

    from @timeLog

    )

    , workedTime

    AS

    (

    SELECT f.EMPID

    , SUM(CASE WHEN f.[STATUS] = 'OUT' THEN DATEDIFF(MINUTE,s.[DATE],f.[DATE]) ELSE 0 END) -- total worked minutes from start to end (including time-breaks)

    - SUM(CASE WHEN f.[STATUS] = 'RESUME' THEN DATEDIFF(MINUTE,s.[DATE],f.[DATE]) ELSE 0 END) -- total paused minutes

    AS TotalMinutesWorked

    FROM minutePeriods f

    JOIN minutePeriods s

    ON s.EMPID = f.EMPID

    AND s.periodGroup = f.periodGroup

    AND s.PosNo = f.NxtPosNo

    WHERE f.[STATUS] = 'OUT' OR f.[STATUS] = 'RESUME'

    GROUP BY f.EMPID

    )

    select EMPID

    ,CAST(TotalMinutesWorked/60 AS VARCHAR(11)) + ' Hrs ' +

    CAST(TotalMinutesWorked - TotalMinutesWorked/60 * 60 AS VARCHAR(11)) + 'Min' AS WorkedTime

    from workedTime

    As you can see, the calculation will only take in count "closed" periods.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Thanks.

    It resolved my problem.

    Really helpful.

    In future post topic i will follow your suggestion.

    Thanks,

    Shatrughna

    Shatrughna

  • You are welcome!

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • For the emps with matching in/out and pause/resume rows [and presumably you edit for those who don't match up], you don't have to go thru all that.

    select EMPID, cast(WorkMins / 60 AS varchar(5)) + ':' +

    right('0' + cast(WorkMins % 60 AS varchar(2)), 2) as [Work HH:MM]

    from (

    select EMPID,

    (sum(case when STATUS = 'out' then datediff(minute, '19700101', [DATE]) else 0 end) -

    sum(case when STATUS = 'login' then datediff(minute, '19700101', [DATE]) else 0 end)) -

    (sum(case when STATUS = 'resume' then datediff(minute, '19700101', [DATE]) else 0 end) -

    sum(case when STATUS = 'pause' then datediff(minute, '19700101', [DATE]) else 0 end)) WorkMins

    from @timelog

    --where [DATE] between @startDate and @endDate

    group by EMPID

    having sum(case when STATUS = 'out' then 1 else 0 end) = sum(case when STATUS = 'login' then 1 else 0 end)

    and sum(case when STATUS = 'resume' then 1 else 0 end) = sum(case when STATUS = 'pause' then 1 else 0 end)

    ) as derived

    order by EMPID

    EDIT: Changed from typo COUNT to obvious SUM in the having clause.

    Scott Pletcher, SQL Server MVP 2008-2010

  • I don't like complex queries over large datasets when you can easily cache the data:

    declare @timeLog table (SNO int, EMPID int, [STATUS] char(10), [DATE] Datetime, [DURATION] int)

    the logout logic would look (something) like:

    GO

    create procdbo.spLogout

    @EmpID int

    as

    --

    declare@LoginDate datetime

    declare@status char(10)

    declare@Now datetime

    --

    selecttop 1

    @LoginDate = [DATE]

    ,@status = [STATUS]

    from@timeLog

    where[STATUS] in ('LOGIN', 'OUT')

    and[EMPID] = @EmpID

    order by[SNO] desc

    --

    if@status <> 'LOGIN'

    return0 -- no open session

    --

    set@Now = getutcdate()

    --

    insert into@timeLog ([EMPID], [STATUS], [DATE], [DURATION])

    select@EmpID, 'OUT', @Now, datediff(second, @LoginDate, @Now)

    --

    return1 -- success

    --(end)

    GO

    that is simpler, uses less runtime resources (cpu, io, etc.) but uses a bit more storage

  • scott.pletcher (7/3/2010)


    For the emps with matching in/out and pause/resume rows [and presumably you edit for those who don't match up], you don't have to go thru all that.

    ...

    If I will add two more test records as

    insert @timeLog

    ...

    union select 113663, 320420, 'LOGIN', '7/3/10 4:21 PM'

    union select 113664, 320421, 'OUT', '7/3/10 4:26 PM'

    You will see that your query will return strange results as to make it work, COUNT should be changed to SUM in the HAVING clause.

    But, If I will add the following test data:

    insert @timeLog

    ...

    union select 113665, 320422, 'OUT', '7/3/10 4:21 PM'

    union select 113666, 320422, 'LOGIN', '7/3/10 4:26 PM'

    The query will stop working. As it is important to check if the LOGIN comes before OUT in the given time frame.

    I guess futher modifications can be made to make your query work for all situations...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • even simpler and more efficient is to use a single record per session:

    create table dbo.TimeLog (

    [TimeLogID] int identity

    ,[EmpID] int not null

    ,[LoginTime] datetime not null

    ,[LogoutTime] datetime

    ,[PauseDuration] int not null

    ,[SessionDuration] int not null

    ,[IsPaused] datetime

    )--create table

    if you want to store the pause start times - use a separate table

    that way you don't need to consider more than one record at any time

    just apply the KISS principal ... one record is better than many ...

    a simple record lookup is better than a 100 line query ...

    etc etc etc

  • doobya (7/4/2010)


    I don't like complex queries over large datasets when you can easily cache the data:

    declare @timeLog table (SNO int, EMPID int, [STATUS] char(10), [DATE] Datetime, [DURATION] int)

    the logout logic would look (something) like:

    GO

    create procdbo.spLogout

    @EmpID int

    as

    ...

    It is may be a good idea, but it will only work to get the working time for a single employee at the time.

    What about if want a report?

    Will you call a stored proc to calculate working time for all or sub-set of employees in a loop?

    It will not perform well enough I'm afraid...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • doobya (7/4/2010)


    even simpler and more efficient is to use a single record per session:

    create table dbo.TimeLog (

    [TimeLogID] int identity

    ,[EmpID] int not null

    ,[LoginTime] datetime not null

    ,[LogoutTime] datetime

    ,[PauseDuration] int not null

    ,[SessionDuration] int not null

    ,[IsPaused] datetime

    )--create table

    if you want to store the pause start times - use a separate table

    ...

    I would not recommend such table design for logging details in OLTP system. It looks more like a table from reporting database, where the data needs to be denormailized.

    Just think how this table can be used. When to insert the record? In time of login or after the log-off? If insert a record in time of login, then what will happen if there are multiple pauses? You will need to update its aggregated duration every time! I would not say this will look nice in your code, also you will miss the fact that it was mulitple pauses while the whole purpose of logging is to get as much audit details as possible.

    And one more thing, I know it is typo, but it's funny: [IsPaused] datetime

    I have seen once column [CountryCode] defined as BIT!

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • I would not recommend such table design for logging details in OLTP system. It looks more like a table from reporting database, where the data needs to be denormailized.

    Just think how this table can be used. When to insert the record? In time of login or after the log-off? If insert a record in time of login, then what will happen if there are multiple pauses? You will need to update its aggregated duration every time! I would not say this will look nice in your code, also you will miss the fact that it was mulitple pauses while the whole purpose of logging is to get as much audit details as possible.

    And one more thing, I know it is typo, but it's funny: [IsPaused] datetime

    I have seen once column [CountryCode] defined as BIT!

    - multiple pauses are aggregated with [PauseDuration] = [PauseDuration] += datediff(second, [IsPaused], getutcdate())

    - and [SessionDuration] = datediff(second, [LoginTime], getutcdate()) - [PauseDuration]

    - you can record the exact same amount of data by recording only pause data to a separate table as I mentioned

    [PauseID],[TimeLogID],[PauseStartTime],[PauseDuration]

    - this will simplify everything else - especially duration becomes a lookup from a single record!

    - IsPaused is *not* a typo - it acts as a bit field if [IsPaused] is null // but also stores the start time of the Pause

    storing each event in a separate record is not a good idea - the crazy queries given above show that

    you are spreading information for a single "thing" (a login session) across multiple records

    which is not a good idea

    plus if you analyse the two approaches from the point of view of:

    - simplicity in implementation and implementation of related functionality

    - for example my version of query for duration becomes:

    select [SessionDuration] from dbo.TimeLog where [TimeLogID] = 1

    can you find a more efficient sql query than that? a single field / record lookup on a unique clustered index?

    - performance / cpu / io costs

    - size of storage space required

    there is no competition - my single row per session approach wins hands down

    the two table approach also allows you to implement two different purging policies - one for sessions / one for pauses

    IMO the OPs initial design is a mistake because it trades simplicity of insert for complexity of query

    which isn't a good deal - basic common sense - you only insert once but you query N times

    therefore increase the complexity of the insert and reduce the complexity of the query = correct

    what led to the design error was not thinking ahead - if the OP had considered HOW the data was going to be queried

    he would have realised his design was wrong - good engineering involves considering MANY inputs to arrive at a solid design

    that will exhibit the "positive spiral" effect - the idea that everything related to a good design is also good

    he was experiencing the "negative spiral" of a bad design - he couldn't find a simple way to query for duration

  • It is not very good practice to call a column IsPaused to store start time of the Pause as it is totally missleading (and looks stupid, question: IsPaused? answer: 1 Jan 2010 17:29:20 )! In your design, at least, this column should be called PauseStartTime and you are free how to interpret its null value.

    I would not design the table in OLTP thinking a lot about what is the best for getting summary reports out of it. It is true that your desing would help the query reporting session and logon durations summary. But, it is not OLTP purpose to provide the data for reporting in the best possible layout - you design dedicated reporting database for this!

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • the primary function of the IsPaused column is "is this session paused"

    the secondary function of the IsPaused column is "if this session is paused - when was it paused"

    that is why it is called "IsPaused"

    question: IsPaused? answer: [IsPaused] is not null

    your logic is ... illogical

    you are saying that you should design a database but ignore the needs of applications using it

    reporting systems are designed to workaround the weakness of SQL as a report query language

    and the concurrency conflict between OLTP and OLAP

    usually they lag the OLTP data, often by up to a day

    they are not designed to patch up a bad design

    and are not suitable or required for this basic application

    if a bank was to take your advice they would drop the Balance column from the Accounts table and maintain only a Transactions table

    calculating the balance with a large aggregate query over the last N years worth of transactions ????

    I think you need to drastically rethink your approach to database design

  • doobya (7/5/2010)


    the primary function of the IsPaused column is "is this session paused"

    the secondary function of the IsPaused column is "if this session is paused - when was it paused"

    that is why it is called "IsPaused"

    question: IsPaused? answer: [IsPaused] is not null

    your logic is ... illogical

    you are saying that you should design a database but ignore the needs of applications using it

    reporting systems are designed to workaround the weakness of SQL as a report query language

    and the concurrency conflict between OLTP and OLAP

    usually they lag the OLTP data, often by up to a day

    they are not designed to patch up a bad design

    and are not suitable or required for this basic application

    if a bank was to take your advice they would drop the Balance column from the Accounts table and maintain only a Transactions table

    calculating the balance with a large aggregate query over the last N years worth of transactions ????

    I think you need to drastically rethink your approach to database design

    1. In terms of the "primary use" and column names, may be your logic is a bullet proof in your head, but is is illogical for me.

    Lets agree on our dissagreements here. I do believe that having datetime column named "PauseStartTime" and interpreting its null value as you wish is way better than naming datetime column "IsPaused" and interpreting its datetime null value as "false" and non-null value as "true". Do you really want to put this for vote?

    2. Regarding banks...

    I don't know if you ever worked with real-time bank's transaction systems...

    However, even you admit that table which holds transactions doesn't need to have balance column at all (actually it doesn't take any sence in terms of bank transaction nature). Well, you will, most likely, find Balance column (or actually few of them) on Account table, yes, but it is in no way guaranteed account balance it is just a tracking one, usually ingnoring some of the credit transaction, making it quite pessimistic. Exact account balance, in most of the real life banking systems, will be calculated at the end of business day (depends on account type) based on account balance from previous day + all transactions against this account since last time balance calculation. You may not believe, but there are very good business and scientific reasons behind of it.

    Any way, logging table presented by this OP represents "a la Transaction" table. As I can see you don't advocate to have Balance column in it, you have it in Account table (kind of summary one in your view), therefore advising change of design to the OP, you better suggest something allong the lines:

    If you have requirements, in your system, to often perform statistic analyses of the logging details, you may consider adding another table where such denormailized pre-calculated statiscs will be stored.

    Depending of the system scale and importance, I might be follow this advise...

    BTW, I have been consulting some banks in this area for quite a bit...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Eugene Elutin (7/5/2010)


    1. In terms of the "primary use" and column names, may be your logic is a bullet proof in your head, but is is illogical for me.

    Lets agree on our dissagreements here. I do believe that having datetime column named "PauseStartTime" and interpreting its null value as you wish is way better than naming datetime column "IsPaused" and interpreting its datetime null value as "false" and non-null value as "true". Do you really want to put this for vote?

    It is a small issue - as long as you can find a naming convention that is consistent and relevant - choose any name you like 🙂

    I believe the purpose of the column is more relevant to its name than its type - and calling it PauseStartTime will be confusing because

    it hides its primary function (a flag) - whereas "[IsPaused] datetime" clearly shows two functions

    Voting has no value - lots of people disagreeing with me doesn't mean my point of view isn't better *-)

    in fact any history book will show that the majority tend to disagree with good / new ideas

    Eugene Elutin (7/5/2010)2. Regarding banks...

    I don't know if you ever worked with real-time bank's transaction systems...

    However, even you admit that table which holds transactions doesn't need to have balance column at all (actually it doesn't take any sence in terms of bank transaction nature). Well, you will, most likely, find Balance column (or actually few of them) on Account table, yes, but it is in no way guaranteed account balance it is just a tracking one, usually ingnoring some of the credit transaction, making it quite pessimistic. Exact account balance, in most of the real life banking systems, will be calculated at the end of business day (depends on account type) based on account balance from previous day + all transactions against this account since last time balance calculation. You may not believe, but there are very good business and scientific reasons behind of it.

    Any way, logging table presented by this OP represents "a la Transaction" table. As I can see you don't advocate to have Balance column in it, you have it in Account table (kind of summary one in your view), therefore advising change of design to the OP, you better suggest something allong the lines:

    If you have requirements, in your system, to often perform statistic analyses of the logging details, you may consider adding another table where such denormailized pre-calculated statiscs will be stored.

    Depending of the system scale and importance, I might be follow this advise...

    BTW, I have been consulting some banks in this area for quite a bit...

    I think we have some agreement here then

    I haven't worked on an actual bank - but I have worked on realtime payment and micropayment systems that are essentially the same

    and maintaining an up-to-date balance for each account is very important (it is real money after all)

    Although for high TPS micropayments we used lazy balance checks and message queues to keep the performance up in exchange for some financial risk

    Worked out OK though

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply