Matching employee application log ins to log outs

  • [font="Verdana"]

    I'm in the process of building some ETL logic for our data warehouse that merges employee application logon records with logoff records. This is used to generate session statistics such as transaction throughput for the session.

    I've run into a bit of a problem that's got me stumped, and I'm thinking someone here will hopefully be able to point me in the right direction!

    The issue is that I have been matching the logon to the logoff based on the employee number and the logon time/logoff time. I match the latest logon time to the earliest logoff time, where the logon time is before the logoff time.

    However, in my test data I have an example where the following must have happened:

    1. An employee logs on to one machine

    2. The same employee logs on to another machine, without having logged off

    3. The employee logs off the first machine

    4. The employee logs off the second machine

    My matching code for this returns only one record, not two. It matches the logon for the second session to the logoff for the first session. Ooops.

    (It actually doesn't matter in which order they log off. They're not supposed to do this, so it doesn't matter if I get the logouts in the wrong order. What does matter is that I correctly get two sessions, not one.)

    Any ideas on how I can resolve this scenario so it correctly returns two records would be greatly appreciated! Here's some example code illustrating the issue.

    if object_id('dbo.Logon') is not null

    drop table dbo.Logon;

    create table dbo.Logon(

    LogonID int identity(1,1) not null,

    LogonTime datetime not null,

    EmployeeNumber int not null,

    constraint Logon_PK primary key (

    LogonID

    ),

    constraint Logon_AK unique (

    LogonTime,

    EmployeeNumber

    )

    );

    set identity_insert dbo.Logon on;

    insert into dbo.Logon(

    LogonID,

    LogonTime,

    EmployeeNumber

    )

    select 1, '20090316 10:15:10', 1 union all

    select 2, '20090316 10:16:53', 1;

    set identity_insert dbo.Logon off;

    if object_id('dbo.Logoff') is not null

    drop table dbo.Logoff;

    create table dbo.Logoff(

    LogoffID int identity(1,1) not null,

    LogoffTime datetime not null,

    EmployeeNumber int not null,

    constraint Logoff_PK primary key (

    LogoffID

    ),

    constraint Logoff_AK unique (

    LogoffTime,

    EmployeeNumber

    )

    );

    set identity_insert dbo.Logoff on;

    insert into dbo.Logoff(

    LogoffID,

    LogoffTime,

    EmployeeNumber

    )

    select 1, '20090316 12:01:47', 1 union all

    select 2, '20090316 17:48:11', 1;

    set identity_insert dbo.Logoff off;

    --

    -- get the latest logon that matches to the earliest logoff

    --

    select EmployeeNumber,

    min(LogoffTime) as LogoffTime,

    LogonTime

    from (

    select Logoff.EmployeeNumber,

    Logoff.LogoffTime,

    max(Logon.LogonTime) as LogonTime

    from dbo.Logon

    inner join

    dbo.Logoff

    on Logon.EmployeeNumber = Logoff.EmployeeNumber and

    Logon.LogonTime < Logoff.LogoffTime

    group by

    Logoff.EmployeeNumber,

    Logoff.LogoffTime

    ) x

    group by

    EmployeeNumber,

    LogonTime;

    Edited to correct code formatting and fix the fact that I can't remember whether I am using "logon/logoff" or "login/logout".

    [/font]

  • Do you ever miss a login or a logout? If the answer is no, then you should be able to just count the logins and logouts and match then up based on equal counts.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • If you have issues with employees logging in simultaneously on multiple machines, you may want to add another field to the login tracking table to track machine name using the HOST_NAME() function. We use this for audit trail purposes, although it is rendered less useful when you have multiple users on the same Citrix server.

    If you capture the host name, then you have another field to match against.

  • Is this what you're looking for?

    ;WITH Logons

    AS

    (

    SELECTROW_NUMBER() OVER( PARTITION BY EmployeeNumber ORDER BY LogonTime ) AS RowNumber,

    LogonTime, EmployeeNumber

    FROMLogon

    ),

    Logoffs

    AS

    (

    SELECTROW_NUMBER() OVER( PARTITION BY EmployeeNumber ORDER BY LogoffTime ) AS RowNumber,

    LogoffTime, EmployeeNumber

    FROMLogoff

    )

    SELECTI.EmployeeNumber, O.LogoffTime, I.LogonTime

    FROMLogons I

    LEFT JOIN Logoffs O ON I.EmployeeNumber = O.EmployeeNumber AND I.RowNumber = O.RowNumber

    --Ramesh


  • Hi Bruce

    I have had a similar business case. I used the @@SPID for a lookup into the system view dm_exec_sessions.

    You can use the session_id and the login_time as a unique key. At user logoff you can use the information from your login table (or again from system view) and link your logoff to the previous logon.

    Greets

    Flo

  • RBarryYoung (3/29/2009)


    Do you ever miss a login or a logout? If the answer is no, then you should be able to just count the logins and logouts and match then up based on equal counts.

    [font="Verdana"]That would be nice, wouldn't it?

    They have only just started recording logins (new business requirement.) So I have a great deal of historical logouts with no corresponding logins. I don't yet have any production data to show whether it's possible to get logouts without logins for other reasons.

    Also, it's possible to crash out of the application, and thus not record a logout.

    So yes, both logins and logouts can be missed.

    [/font]

  • Ross McMicken (3/30/2009)


    If you have issues with employees logging in simultaneously on multiple machines, you may want to add another field to the login tracking table to track machine name using the HOST_NAME() function. We use this for audit trail purposes, although it is rendered less useful when you have multiple users on the same Citrix server.

    [font="Verdana"]Um... yes. Did I mention that one of the data sources is an application running across a mainframe? The concept of a "host" becomes somewhat fluid.

    Also, one of the joys of being within a business intelligence division is that you have almost no influence over making changes to applications. So if they are not collecting the data, we can't have it.

    It's not entirely that black and white, but suffice to say that to get a change to a production system in place there's a reasonable amount of time and effort required.[/font]

  • Florian Reischl (3/30/2009)


    Hi Bruce

    I have had a similar business case. I used the @@SPID for a lookup into the system view dm_exec_sessions.

    You can use the session_id and the login_time as a unique key. At user logoff you can use the information from your login table (or again from system view) and link your logoff to the previous logon.

    [font="Tahoma"]Thanks Flo. Yes, @@spid would do it (in combination with the @@servername as much of the data is collected on deparmental servers), but I don't get that information (see my previous post).

    I hadn't thought of using it as a lookup into the dm_exec_sessions table. That's very clever! I'll keep that in mind. 😀

    [/font]

  • Ramesh (3/30/2009)


    Is this what you're looking for? (Code with ranking)

    [font="Verdana"]Hey Ramesh,

    Yes, I was thinking I could do something better with ranking functions, but my brain just wasn't going there. That works well for my simplified example. Sadly, it doesn't work quite so well where I have missed logins or logouts.

    Having said that, I still think it's a better approach than what I was using with the combination of min() and max(). So I will have a play around with that further.

    [/font]

  • [font="Verdana"]Thanks for the replies: certainly appreciated.

    I've had over a day to think about this now, and here's an update to what I've come up with.

    It turns out that overlapping sessions is quite common place: they may have the application open for enquiries while working on a transaction on another instance of the application.

    Because I have to allocate transactions to the sessions (session defined by a login and a logout), it becomes problematic when we have overlapping sessions.

    So basically where I have the following:

    Login for A

    Login for B

    Logout for A (or B)

    Logout for B (or A)

    I will refactor it somewhat so I end up with:

    Login for A

    Logout for A (generated)

    Login for B

    Logout for B (potentially generated)

    Login for A (potentially generated)

    Logout for A

    By doing this I can ensure there is only one effective session at any one time. That makes it easy to allocate transactions to sessions (which is used to measure throughput against the session duration).

    It shouldn't effect the results as they are summed/averaged over the entire day. Well, that's not entirely true: if they login just to do enquiries before they begin processing, then yes it will effect results. So that needs to be communicated back to the users as a bad idea.

    Now I just have to figure out how best to do this...

    [/font]

  • Bruce W Cassidy (3/29/2009)


    The issue is that I have been matching the logon to the logoff based on the employee number and the logon time/logoff time. I match the latest logon time to the earliest logoff time, where the logon time is before the logoff time.

    However, in my test data I have an example where the following must have happened:

    1. An employee logs on to one machine

    2. The same employee logs on to another machine, without having logged off

    3. The employee logs off the first machine

    4. The employee logs off the second machine

    My matching code for this returns only one record, not two. It matches the logon for the second session to the logoff for the first session. Ooops.

    ...

    Any ideas on how I can resolve this scenario so it correctly returns two records would be greatly appreciated! Here's some example code illustrating the issue.

    Given all that we have learned about your situation, I think that the best/only reasonable course of action is, instead of matching logon/logoff based on employee number, match them based on (employee number + machine).

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (3/31/2009)


    Given all that we have learned about your situation, I think that the best/only reasonable course of action is, instead of matching logon/logoff based on employee number, match them based on (employee number + machine).

    [font="Verdana"]That would be nice too. Two minor issues:

    1. The details of the machine they log onto are not recorded.

    2. They can have multiple concurrent sessions open on the same machine (which is common.)

    The reason we are capturing logins now is that we need to record in which team they were working. Which team they are working in effects the expected transaction throughput. The same person can work across multiple teams in the same day.

    Up until now a spreadsheet was kept by the supervisors to capture the amount of time they worked within a team. Now we wish to replace that spreadsheet by actually getting the login from when they sign in, along with what team they will be working against.

    What also makes things tricky is that they can also work a shift that crosses midnight.

    So here's what I've decided to do:

    1. Keep a "rolling team" for an employee, only using the login times from when they switch teams.

    2. After a team switch, find the first transaction. Start the session duration from there.

    3. Whenever there is a gap between transactions of half an hour

    - use the last transaction as the session logout time

    - use the next transaction as the session login time.

    This allows them to walk away from the PC for meetings and lunch or dinner and not be penalised for having forgotten to log out (which happens a lot.) It also allows them to have multiple concurrent sessions open at once without it being an issue.

    Anyway, now I need to write this up in the design document and get it agreed to by the business.

    Thanks all for your advice! I've ended up having to walk away from using the combination of login times and logout times altogether, but that's what the design process is all about.

    [/font]

  • I too had somewhat similar situation where I need to process a file (imported by the user) , then dump it in a table and return back data of that particular session. Here, I'd let the application provide me the session id (a 32 byte GUID) and save the session id in the table.

    I'm not saying this is the right way to do, but still I think, since the sql server has no-way to know which request came from which application session (i.e. in your case, its the logout request related to login request). So, its the application's job to let sql server know that the session of the request.

    --Ramesh


Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply