Performance Issue with Time Tracking application

  • Hi,

    I am having some serious performance issues when trying to represent time and attendance data from our Access control system. The system keeps a archive table which logs all In and Out (This includes internal doors, which is not of interest currently) occurances for all staff members on site.

    The table structure is as follows:

    CREATE TABLE #myTable(

    [x_hist_type] [int] NULL,

    [x_panel_name] [char](32) NULL,

    [x_term_name] [char](32) NULL,

    [x_badge_number] [char](20) NULL,

    [x_timestamp] [datetime] NOT NULL,

    [x_guid] [uniqueidentifier] ROWGUIDCOL NOT NULL,

    PRIMARY KEY NONCLUSTERED

    (

    [x_guid] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    Sample Data:

    INSERT INTO #myTable

    (x_hist_type,x_panel_name,x_term_name,x_badge_number,x_timestamp,x_guid)

    SELECT '68','Turnstiles ','Turnstile In ','508 ','Aug 2 2010 7:20AM','0D9D5265-85BE-46A9-96BC-359F997BF493' UNION ALL

    SELECT '68','Turnstiles ','Turnstile Out ','508 ','Aug 2 2010 8:47AM','83395CFE-2BC9-4192-8CE2-7193CF3996DF' UNION ALL

    SELECT '68','Turnstiles ','Turnstile In ','508 ','Aug 2 2010 8:50AM','D60A9DD6-5312-428B-ADA7-4FDAB9866DF8' UNION ALL

    SELECT '68','Turnstiles ','Turnstile Out ','508 ','Aug 2 2010 4:13PM','BC5ADC5A-48A2-463F-BC68-BAA670ACC712' UNION ALL

    SELECT '68','Turnstiles ','Turnstile In ','508 ','Aug 3 2010 7:13AM','CD18BA98-EA51-4B8E-BDD7-BD96CA5BC924' UNION ALL

    SELECT '68','Turnstiles ','Turnstile Out ','508 ','Aug 3 2010 9:41AM','BF7F845F-E931-4FAC-8181-D62C51D898F3' UNION ALL

    SELECT '68','Turnstiles ','Turnstile Out ','508 ','Aug 3 2010 9:41AM','F57CAA2D-0539-41F4-A489-3B13BB37B7C5' UNION ALL

    SELECT '68','Turnstiles ','Turnstile In ','508 ','Aug 3 2010 11:34AM','0CB79832-0A36-4F2D-96D2-FEF85CD15992' UNION ALL

    SELECT '68','Turnstiles ','Turnstile Out ','508 ','Aug 3 2010 5:15PM','C5A93520-408E-4B95-B63A-953FC9999578' UNION ALL

    SELECT '68','Turnstiles ','Turnstile In ','508 ','Aug 4 2010 7:21AM','891DD8F0-8099-46EF-B4B5-9AEBB90C1981' UNION ALL

    SELECT '68','Turnstiles ','Turnstile Out ','508 ','Aug 4 2010 9:50AM','898AF8EC-2C52-4D1C-85E7-ADEDDFACE325' UNION ALL

    SELECT '68','Turnstiles ','Turnstile In ','508 ','Aug 4 2010 12:17PM','6BC94FDF-D15F-4410-BF0B-CA8EC218596B' UNION ALL

    SELECT '68','Turnstiles ','Turnstile Out ','508 ','Aug 4 2010 4:40PM','10052249-BD44-499C-BCED-0669E2319B88'

    This sample data represents a common problem which we have. Sometimes when a terminal is scanned it doesn't release the door so the terminal is scanned a second time which causes two entries with a couple of seconds between them.

    The code I have generated to handle this data is as follows, it gives the correct results but takes a long time to complete for 1 user nevermind the entire factory.

    WITH cte_1 -- order by time and spilt to InTime, OutTime

    AS (SELECT

    x_badge_number

    ,case WHEN rtrim(x_term_name) LIKE '%IN' THEN x_timestamp ELSE NULL END AS InTime

    ,case WHEN rtrim(x_term_name) LIKE '%OUT' THEN x_timestamp ELSE NULL END AS OutTime

    ,case when rtrim(x_term_name) LIKE '%IN' THEN 'IN' ELSE 'OUT' END AS Terminal

    ,row_number() OVER (PARTITION BY x_badge_number ORDER BY x_timestamp ASC ) AS rn

    FROM

    #myTable

    WHERE x_panel_name = 'turnstiles' AND x_hist_type = 68 -- and x_badge_number IN (3518,506)

    )

    ,

    cte_2 -- mark those that repeat

    AS (SELECT

    t.x_badge_number

    ,t.InTime

    ,t.OutTime

    ,t.Terminal

    ,t.rn

    ,case( SELECT Terminal FROM cte_1 AS x WHERE x.rn = t.rn - 1 AND x.x_badge_number = t.x_badge_number) WHEN t.Terminal THEN 1 ELSE 0 END AS mrk

    FROM cte_1 AS t

    )

    ,

    cte_3 --extract non repeats and group

    AS (SELECT

    x_badge_number, InTime, OutTime, Terminal, rn, mrk

    ,row_number() OVER ( PARTITION BY Terminal ORDER BY x_badge_number,rn ASC ) AS rn2

    FROM cte_2

    WHERE mrk = 0

    )

    SELECT

    t1.x_badge_number

    ,t1.InTime

    ,t2.Outtime

    ,datediff(MINUTE, t1.InTime, t2.OutTime) AS Duration

    FROM

    cte_3 AS t1

    INNER JOIN cte_3 AS t2 ON t1.rn2 = t2.rn2

    WHERE

    t1.Intime IS NOT NULL

    AND t2.Outtime IS NOT NULL

    ORDER BY

    t1.rn

    This is my first post so i hope I gave included all the required information.

    Oh, the output from this is like as follows:

    3518 2010-07-06 07:56:12.0002010-07-06 16:51:21.000535

    3518 2010-07-08 08:02:50.0002010-07-08 16:50:33.000528

    3518 2010-07-09 08:11:32.0002010-07-09 16:47:36.000516

    3518 2010-07-12 08:02:27.0002010-07-12 16:40:16.000518

    3518 2010-07-13 07:51:40.0002010-07-13 16:46:05.000535

    3518 2010-07-14 08:06:37.0002010-07-14 17:03:59.000537

    3518 2010-07-15 08:09:20.0002010-07-15 16:44:18.000515

    Is there anything that I could do too improve the speed of this code?

    Thanks,

    Anthony

  • I've found through a lot of testing that the use of ROW_NUMBER when returning more than a few rows has a tendency to degrade faster than using a TOP with an ORDER BY, assuming you have a good index for ordering the data. I posted the tests here at the PASS web site (it requires a login, sorry, but it's free to set up). I've updated the article and it'll be coming out in Simple-Talk within a week or two. Anyway, I'd try rewriting with the TOP statement. Other than that, your approach looks good to me.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 2 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply