Calculate Working Hour Query

Question

Calculate Working Hour Query

shatrughna

Ten Centuries

Points: 1037
More actions
July 2, 2010 at 4:43 am

#237176

Hi All,
I am maintaining a login and logout in my web application.
My tables entries as below.
SNO EMPIDSTATUSDATE
113645320417LOGIN7/2/10 3:21 PM
113648320417PAUSE7/2/10 3:25 PM
113649320417RESUME7/2/10 3:25 PM
113650320417OUT 7/2/10 3:54 PM
113651320418LOGIN7/2/10 4:21 PM
113652320418PAUSE7/2/10 4:25 PM
113653320418RESUME7/2/10 4:25 PM
113654320418OUT 7/2/10 4:54 PM
Can any one suggest me how can i calculate total working hour for each employee?
Thanks!
Shatrugnhna
Shatrughna

Viewing 15 posts - 1 through 15 (of 25 total)

You must be logged in to reply to this topic. Login to reply

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 1

Next time, please present your question in more helpfull way (you can find the link to how to do so in the bottom of my signature)!

I have added some more test data, which contain records for the multiple days and situation where user didn't paused.

declare @timeLog table (SNO int, EMPID int, [STATUS] char(10), [DATE] Datetime)

insert @timeLog

select 113645, 320417, 'LOGIN', '7/2/10 3:21 PM'

union select 113648, 320417, 'PAUSE', '7/2/10 3:23 PM'

union select 113649, 320417, 'RESUME', '7/2/10 3:24 PM'

union select 113650, 320417, 'OUT', '7/2/10 3:54 PM'

union select 113651, 320418, 'LOGIN', '7/2/10 4:21 PM'

union select 113652, 320418, 'PAUSE', '7/2/10 4:31 PM'

union select 113653, 320418, 'RESUME', '7/2/10 4:33 PM'

union select 113654, 320418, 'OUT', '7/2/10 4:54 PM'

union select 113655, 320418, 'LOGIN', '7/3/10 4:21 PM'

union select 113656, 320418, 'PAUSE', '7/3/10 4:26 PM'

union select 113657, 320418, 'RESUME', '7/3/10 4:28 PM'

union select 113658, 320418, 'OUT', '7/3/10 4:54 PM'

union select 113659, 320419, 'LOGIN', '7/3/10 4:21 PM'

union select 113662, 320419, 'OUT', '7/3/10 4:54 PM'

;with minutePeriods

as

(

select EMPID, [STATUS], [DATE]

,CASE WHEN [STATUS] = 'LOGIN' or [STATUS] = 'OUT' THEN 1 ELSE 2 END AS periodGroup

,ROW_NUMBER() OVER(PARTITION BY EMPID

ORDER BY CASE WHEN [STATUS] = 'LOGIN' or [STATUS] = 'OUT' THEN 1 ELSE 2 END

,[DATE] DESC) AS PosNo

,ROW_NUMBER() OVER(PARTITION BY EMPID

ORDER BY CASE WHEN [STATUS] = 'LOGIN' or [STATUS] = 'OUT' THEN 1 ELSE 2 END

,[DATE] DESC) + 1 AS NxtPosNo

from @timeLog

)

, workedTime

AS

(

SELECT f.EMPID

, SUM(CASE WHEN f.[STATUS] = 'OUT' THEN DATEDIFF(MINUTE,s.[DATE],f.[DATE]) ELSE 0 END) -- total worked minutes from start to end (including time-breaks)

- SUM(CASE WHEN f.[STATUS] = 'RESUME' THEN DATEDIFF(MINUTE,s.[DATE],f.[DATE]) ELSE 0 END) -- total paused minutes

AS TotalMinutesWorked

FROM minutePeriods f

JOIN minutePeriods s

ON s.EMPID = f.EMPID

AND s.periodGroup = f.periodGroup

AND s.PosNo = f.NxtPosNo

WHERE f.[STATUS] = 'OUT' OR f.[STATUS] = 'RESUME'

GROUP BY f.EMPID

)

select EMPID

,CAST(TotalMinutesWorked/60 AS VARCHAR(11)) + ' Hrs ' +

CAST(TotalMinutesWorked - TotalMinutesWorked/60 * 60 AS VARCHAR(11)) + 'Min' AS WorkedTime

from workedTime

As you can see, the calculation will only take in count "closed" periods.

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

shatrughna Ten Centuries Points: 1037 More actions · Answer 2

Thanks.

It resolved my problem.

Really helpful.

In future post topic i will follow your suggestion.

Thanks,

Shatrughna

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 3

You are welcome!

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

scott.pletcher SSCertifiable Points: 5958 More actions · Answer 4

For the emps with matching in/out and pause/resume rows [and presumably you edit for those who don't match up], you don't have to go thru all that.

select EMPID, cast(WorkMins / 60 AS varchar(5)) + ':' +

right('0' + cast(WorkMins % 60 AS varchar(2)), 2) as [Work HH:MM]

from (

select EMPID,

(sum(case when STATUS = 'out' then datediff(minute, '19700101', [DATE]) else 0 end) -

sum(case when STATUS = 'login' then datediff(minute, '19700101', [DATE]) else 0 end)) -

(sum(case when STATUS = 'resume' then datediff(minute, '19700101', [DATE]) else 0 end) -

sum(case when STATUS = 'pause' then datediff(minute, '19700101', [DATE]) else 0 end)) WorkMins

from @timelog

--where [DATE] between @startDate and @endDate

group by EMPID

having sum(case when STATUS = 'out' then 1 else 0 end) = sum(case when STATUS = 'login' then 1 else 0 end)

and sum(case when STATUS = 'resume' then 1 else 0 end) = sum(case when STATUS = 'pause' then 1 else 0 end)

) as derived

order by EMPID

EDIT: Changed from typo COUNT to obvious SUM in the having clause.

Scott Pletcher, SQL Server MVP 2008-2010

DataDog Hall of Fame Points: 3305 More actions · Answer 5

I don't like complex queries over large datasets when you can easily cache the data:

declare @timeLog table (SNO int, EMPID int, [STATUS] char(10), [DATE] Datetime, [DURATION] int)

the logout logic would look (something) like:

GO

create procdbo.spLogout

@EmpID int

as

--

declare@LoginDate datetime

declare@status char(10)

declare@Now datetime

--

selecttop 1

@LoginDate = [DATE]

,@status = [STATUS]

from@timeLog

where[STATUS] in ('LOGIN', 'OUT')

and[EMPID] = @EmpID

order by[SNO] desc

--

if@status <> 'LOGIN'

return0 -- no open session

--

set@Now = getutcdate()

--

insert into@timeLog ([EMPID], [STATUS], [DATE], [DURATION])

select@EmpID, 'OUT', @Now, datediff(second, @LoginDate, @Now)

--

return1 -- success

--(end)

GO

that is simpler, uses less runtime resources (cpu, io, etc.) but uses a bit more storage

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 6

scott.pletcher (7/3/2010)
For the emps with matching in/out and pause/resume rows [and presumably you edit for those who don't match up], you don't have to go thru all that.
...

If I will add two more test records as

insert @timeLog

...

union select 113663, 320420, 'LOGIN', '7/3/10 4:21 PM'

union select 113664, 320421, 'OUT', '7/3/10 4:26 PM'

You will see that your query will return strange results as to make it work, COUNT should be changed to SUM in the HAVING clause.

But, If I will add the following test data:

insert @timeLog

...

union select 113665, 320422, 'OUT', '7/3/10 4:21 PM'

union select 113666, 320422, 'LOGIN', '7/3/10 4:26 PM'

The query will stop working. As it is important to check if the LOGIN comes before OUT in the given time frame.

I guess futher modifications can be made to make your query work for all situations...

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

DataDog Hall of Fame Points: 3305 More actions · Answer 7

even simpler and more efficient is to use a single record per session:

create table dbo.TimeLog (

[TimeLogID] int identity

,[EmpID] int not null

,[LoginTime] datetime not null

,[LogoutTime] datetime

,[PauseDuration] int not null

,[SessionDuration] int not null

,[IsPaused] datetime

)--create table

if you want to store the pause start times - use a separate table

that way you don't need to consider more than one record at any time

just apply the KISS principal ... one record is better than many ...

a simple record lookup is better than a 100 line query ...

etc etc etc

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 8

doobya (7/4/2010)
I don't like complex queries over large datasets when you can easily cache the data:
declare @timeLog table (SNO int, EMPID int, [STATUS] char(10), [DATE] Datetime, [DURATION] int)
the logout logic would look (something) like:
GO
create procdbo.spLogout
@EmpID int
as
...

It is may be a good idea, but it will only work to get the working time for a single employee at the time.

What about if want a report?

Will you call a stored proc to calculate working time for all or sub-set of employees in a loop?

It will not perform well enough I'm afraid...

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 9

doobya (7/4/2010)
even simpler and more efficient is to use a single record per session:
create table dbo.TimeLog (
[TimeLogID] int identity
,[EmpID] int not null
,[LoginTime] datetime not null
,[LogoutTime] datetime
,[PauseDuration] int not null
,[SessionDuration] int not null
,[IsPaused] datetime
)--create table
if you want to store the pause start times - use a separate table
...

I would not recommend such table design for logging details in OLTP system. It looks more like a table from reporting database, where the data needs to be denormailized.

Just think how this table can be used. When to insert the record? In time of login or after the log-off? If insert a record in time of login, then what will happen if there are multiple pauses? You will need to update its aggregated duration every time! I would not say this will look nice in your code, also you will miss the fact that it was mulitple pauses while the whole purpose of logging is to get as much audit details as possible.

And one more thing, I know it is typo, but it's funny: [IsPaused] datetime

I have seen once column [CountryCode] defined as BIT!

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

DataDog Hall of Fame Points: 3305 More actions · Answer 10

I would not recommend such table design for logging details in OLTP system. It looks more like a table from reporting database, where the data needs to be denormailized.
Just think how this table can be used. When to insert the record? In time of login or after the log-off? If insert a record in time of login, then what will happen if there are multiple pauses? You will need to update its aggregated duration every time! I would not say this will look nice in your code, also you will miss the fact that it was mulitple pauses while the whole purpose of logging is to get as much audit details as possible.
And one more thing, I know it is typo, but it's funny: [IsPaused] datetime
I have seen once column [CountryCode] defined as BIT!

- multiple pauses are aggregated with [PauseDuration] = [PauseDuration] += datediff(second, [IsPaused], getutcdate())

- and [SessionDuration] = datediff(second, [LoginTime], getutcdate()) - [PauseDuration]

- you can record the exact same amount of data by recording only pause data to a separate table as I mentioned

[PauseID],[TimeLogID],[PauseStartTime],[PauseDuration]

- this will simplify everything else - especially duration becomes a lookup from a single record!

- IsPaused is *not* a typo - it acts as a bit field if [IsPaused] is null // but also stores the start time of the Pause

storing each event in a separate record is not a good idea - the crazy queries given above show that

you are spreading information for a single "thing" (a login session) across multiple records

which is not a good idea

plus if you analyse the two approaches from the point of view of:

- simplicity in implementation and implementation of related functionality

- for example my version of query for duration becomes:

select [SessionDuration] from dbo.TimeLog where [TimeLogID] = 1

can you find a more efficient sql query than that? a single field / record lookup on a unique clustered index?

- performance / cpu / io costs

- size of storage space required

there is no competition - my single row per session approach wins hands down

the two table approach also allows you to implement two different purging policies - one for sessions / one for pauses

IMO the OPs initial design is a mistake because it trades simplicity of insert for complexity of query

which isn't a good deal - basic common sense - you only insert once but you query N times

therefore increase the complexity of the insert and reduce the complexity of the query = correct

what led to the design error was not thinking ahead - if the OP had considered HOW the data was going to be queried

he would have realised his design was wrong - good engineering involves considering MANY inputs to arrive at a solid design

that will exhibit the "positive spiral" effect - the idea that everything related to a good design is also good

he was experiencing the "negative spiral" of a bad design - he couldn't find a simple way to query for duration

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 11

It is not very good practice to call a column IsPaused to store start time of the Pause as it is totally missleading (and looks stupid, question: IsPaused? answer: 1 Jan 2010 17:29:20 )! In your design, at least, this column should be called PauseStartTime and you are free how to interpret its null value.

I would not design the table in OLTP thinking a lot about what is the best for getting summary reports out of it. It is true that your desing would help the query reporting session and logon durations summary. But, it is not OLTP purpose to provide the data for reporting in the best possible layout - you design dedicated reporting database for this!

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

DataDog Hall of Fame Points: 3305 More actions · Answer 12

the primary function of the IsPaused column is "is this session paused"

the secondary function of the IsPaused column is "if this session is paused - when was it paused"

that is why it is called "IsPaused"

question: IsPaused? answer: [IsPaused] is not null

your logic is ... illogical

you are saying that you should design a database but ignore the needs of applications using it

reporting systems are designed to workaround the weakness of SQL as a report query language

and the concurrency conflict between OLTP and OLAP

usually they lag the OLTP data, often by up to a day

they are not designed to patch up a bad design

and are not suitable or required for this basic application

if a bank was to take your advice they would drop the Balance column from the Accounts table and maintain only a Transactions table

calculating the balance with a large aggregate query over the last N years worth of transactions ????

I think you need to drastically rethink your approach to database design

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 13

doobya (7/5/2010)
the primary function of the IsPaused column is "is this session paused"
the secondary function of the IsPaused column is "if this session is paused - when was it paused"
that is why it is called "IsPaused"
question: IsPaused? answer: [IsPaused] is not null
your logic is ... illogical
you are saying that you should design a database but ignore the needs of applications using it
reporting systems are designed to workaround the weakness of SQL as a report query language
and the concurrency conflict between OLTP and OLAP
usually they lag the OLTP data, often by up to a day
they are not designed to patch up a bad design
and are not suitable or required for this basic application
if a bank was to take your advice they would drop the Balance column from the Accounts table and maintain only a Transactions table
calculating the balance with a large aggregate query over the last N years worth of transactions ????
I think you need to drastically rethink your approach to database design

1. In terms of the "primary use" and column names, may be your logic is a bullet proof in your head, but is is illogical for me.

Lets agree on our dissagreements here. I do believe that having datetime column named "PauseStartTime" and interpreting its null value as you wish is way better than naming datetime column "IsPaused" and interpreting its datetime null value as "false" and non-null value as "true". Do you really want to put this for vote?

2. Regarding banks...

I don't know if you ever worked with real-time bank's transaction systems...

However, even you admit that table which holds transactions doesn't need to have balance column at all (actually it doesn't take any sence in terms of bank transaction nature). Well, you will, most likely, find Balance column (or actually few of them) on Account table, yes, but it is in no way guaranteed account balance it is just a tracking one, usually ingnoring some of the credit transaction, making it quite pessimistic. Exact account balance, in most of the real life banking systems, will be calculated at the end of business day (depends on account type) based on account balance from previous day + all transactions against this account since last time balance calculation. You may not believe, but there are very good business and scientific reasons behind of it.

Any way, logging table presented by this OP represents "a la Transaction" table. As I can see you don't advocate to have Balance column in it, you have it in Account table (kind of summary one in your view), therefore advising change of design to the OP, you better suggest something allong the lines:

If you have requirements, in your system, to often perform statistic analyses of the logging details, you may consider adding another table where such denormailized pre-calculated statiscs will be stored.

Depending of the system scale and importance, I might be follow this advise...

BTW, I have been consulting some banks in this area for quite a bit...

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

DataDog Hall of Fame Points: 3305 More actions · Answer 14

Eugene Elutin (7/5/2010)
1. In terms of the "primary use" and column names, may be your logic is a bullet proof in your head, but is is illogical for me.
Lets agree on our dissagreements here. I do believe that having datetime column named "PauseStartTime" and interpreting its null value as you wish is way better than naming datetime column "IsPaused" and interpreting its datetime null value as "false" and non-null value as "true". Do you really want to put this for vote?

It is a small issue - as long as you can find a naming convention that is consistent and relevant - choose any name you like 🙂

I believe the purpose of the column is more relevant to its name than its type - and calling it PauseStartTime will be confusing because

it hides its primary function (a flag) - whereas "[IsPaused] datetime" clearly shows two functions

Voting has no value - lots of people disagreeing with me doesn't mean my point of view isn't better *-)

in fact any history book will show that the majority tend to disagree with good / new ideas

Eugene Elutin (7/5/2010)2. Regarding banks...
I don't know if you ever worked with real-time bank's transaction systems...
However, even you admit that table which holds transactions doesn't need to have balance column at all (actually it doesn't take any sence in terms of bank transaction nature). Well, you will, most likely, find Balance column (or actually few of them) on Account table, yes, but it is in no way guaranteed account balance it is just a tracking one, usually ingnoring some of the credit transaction, making it quite pessimistic. Exact account balance, in most of the real life banking systems, will be calculated at the end of business day (depends on account type) based on account balance from previous day + all transactions against this account since last time balance calculation. You may not believe, but there are very good business and scientific reasons behind of it.
Any way, logging table presented by this OP represents "a la Transaction" table. As I can see you don't advocate to have Balance column in it, you have it in Account table (kind of summary one in your view), therefore advising change of design to the OP, you better suggest something allong the lines:
If you have requirements, in your system, to often perform statistic analyses of the logging details, you may consider adding another table where such denormailized pre-calculated statiscs will be stored.
Depending of the system scale and importance, I might be follow this advise...
BTW, I have been consulting some banks in this area for quite a bit...

I think we have some agreement here then

I haven't worked on an actual bank - but I have worked on realtime payment and micropayment systems that are essentially the same

and maintaining an up-to-date balance for each account is very important (it is real money after all)

Although for high TPS micropayments we used lazy balance checks and message queues to keep the performance up in exchange for some financial risk

Worked out OK though