SQLServerCentral Article

Working with Time Frames in T-SQL

,

Introduction

Several times recently I've had to find the maximum average hourly activity for a set of processes that take place over the course of a day. I have also had the need to determine which processes were taking place during that hour.

I suspect that this kind of problem might be pretty common and that the queries in this article could be used to determine the busiest time at a loading dock, or at a doctor's office or anything else that has a queue.

Each process usually last several hours and one or more processes can simultaneously taking place in the same hour. To compare hourly activity when information about the processes is stored as a start date/time and an end date/time the times must be re-framed into hourly chunks so that they can be summed and compared. In this article, I present a script that re-frames these overall process durations into hourly chunks so that they can be averaged and compared.

Overview

Here's how the script goes about the task of breaking a multi-day process into one hour chunks. Any time slice can have one of the following six relationships to any process, so these are the cases that the script has to accommodate.

  1. The time slice can begin after the process ends.
  2. The time slice can begin after the process starts and end before the process ends.
  3. The time slice can begin after the process begins and end after the process ends
  4. The time slice can begin before the process begins, and end after the process ends.
  5. The time slice can begin before the process begins and end before the process ends.
  6. The time slice can end after the process begins.

Here's an illustration of the different possible relationships of a sample process (yellow)

to a sample time slice (green).

gant chart

To determine how which processes were taking place during a given time frame requires that we look at only events that either started or ended during the time frame (cases 3,4, and 5, above), or spanned the time frame entirely (case 2, above). We can obviously exclude cases 1 and 6.

SELECT
COUNT(*)
FROM
[MY_EVENTS]
WHERE
---CASES 3, 4 AND 5
(([STARTING_TIME] > 7/1/2007 12:00 PM AND [STARTING_TIME] < 7/1/2007 1:00 PM)
OR ([ENDING_TIME] > 7/1/2007 12:00 PM AND [ENDING_TIME] < 7/1/2007 1:00 PM))
OR
---CASE 2
([STARTING_TIME] < 7/1/2007 12:00 PM AND [ENDING_TIME] > 7/1/2007 1:00 PM)

-----------
4

Slicing the Times

This single query only gets us part of the way. It tells us about only one time frame and it doesn't even tell us all there is to know about that one. What we learn from this query is that four processes were was taking place during the 12pm - 1pm time slice but we still don't know how much time in that time slice each of the processes consumed.

To do this, it helps to have these two SQL Server user-defined functions. One computes the max of two dates and the other computes the min of two dates. These functions allow us to frame the edges of our time period when we go to compute the amount of the time period that the event consumed.

CREATE FUNCTION DBO.DATEMAX(@DTDATE1 DATETIME,@DTDATE2 DATETIME)
RETURNS DATETIME
AS
BEGIN
       DECLARE @RTVALUE DATETIME
       IF @DTDATE1 > @DTDATE2
               SET @RTVALUE= @DTDATE1
       ELSE
               SET @RTVALUE= @DTDATE2
       RETURN @RTVALUE
END
CREATE FUNCTION DBO.DATEMIN(@DTDATE1 DATETIME,@DTDATE2 DATETIME)
RETURNS DATETIME
AS
BEGIN
       DECLARE @RTVALUE DATETIME
       IF @DTDATE1 < @DTDATE2
               SET @RTVALUE= @DTDATE1
       ELSE
               SET @RTVALUE= @DTDATE2
       RETURN @RTVALUE
END

Using these functions we can compute the exact number of minutes each process consumed during our 12:00 PM to 1:00 PM time frame:

SELECT
---EVENTS THAT BEGIN BEFORE THE PERIOD AND END AFTER IT,
---AND EVENTS THAT BEGIN BEFORE THE PERIOD AND END DURING IT.
    (SELECT SUM(
DATEDIFF(MI,
DBO.DATEMAX(A.[STARTING_TIME],'7/1/2007 12:00 PM'),
DBO.DATEMIN(A.[ENDING_TIME],'7/1/2007 1:00 PM')
))
    FROM
        [MY_EVENTS] A
    WHERE
        A.[STARTING_TIME] < '7/1/2007 12:00 PM'
        AND A.[ENDING_TIME] > '7/1/2007 12:00 PM'
)
+
---EVENTS THAT BEGIN DURING THE PERIOD
    (SELECT SUM(
DATEDIFF(MI,
A.[STARTING_TIME],
DBO.DATEMIN(A.[ENDING_TIME],'7/1/2007 1:00 PM')
))
    FROM
        [MY_EVENTS] A
    WHERE
        A.[STARTING_TIME] BETWEEN '7/1/2007 12:00 PM' AND '7/1/2007 1:00 PM'
)

-----------
150

While this might be useful, it doesn't compute the maximum hourly utilization during a day, nor does it enable us to show which processes took place within that hour. However, now that we have the total number of minutes of activity that took place in a single time frame it's easy enough to go to the next step.

This next step is to gather data for each hour that took place during our target day.

If we want to see which hour was the busiest, we need to measure each time frame independently

and accumulate the results in a table.

This script (which could be a stored procedure) computes the number of minutes of activity

that took place for each process in each hour between 9:00 AM and 5:00 PM.

DECLARE
        @BEGINTIME DATETIME,
        @ENDTIME DATETIME
SET @BEGINTIME='7/1/07 9:00 AM'
SET @ENDTIME='7/1/07 5:00 PM'
DECLARE @DAYHOURS TABLE (
            [DESCRIPTION] VARCHAR(50),
            BEGINDATE SMALLDATETIME,
            IBEGINHOUR INT,
            IDURATION INT)
DECLARE @CTIME DATETIME
SET @CTIME=@BEGINTIME
WHILE @CTIME < @ENDTIME
BEGIN
       ---EVENTS THAT START BEFORE THE CURRENT TIME FRAME AND END DURING OR AFTER IT.
       INSERT INTO @DAYHOURS ([DESCRIPTION],[BEGINDATE],[IBEGINHOUR],[IDURATION])
       SELECT
               A.[DESCRIPTION],
               @CTIME,
               DATEPART(HH,@CTIME),
               DATEDIFF(MI,DBO.DATEMAX(A.[STARTING_TIME],@CTIME),DBO.DATEMIN(A.[ENDING_TIME],DATEADD(MINUTE,59,@CTIME)))
       FROM
               [MY_EVENTS] A
       WHERE
               A.[STARTING_TIME] < @CTIME AND A.[ENDING_TIME] > @CTIME
       ---EVENTS THAT START DURING CURRENT TIME FRAME AND END AFTER IT
       INSERT INTO @DAYHOURS ([DESCRIPTION],[BEGINDATE],[IBEGINHOUR],[IDURATION])
       SELECT
               A.[DESCRIPTION],
               A.[STARTING_TIME],
               DATEPART(HOUR,A.[STARTING_TIME]),
               DATEDIFF(MI,A.[STARTING_TIME],DBO.DATEMIN(A.[ENDING_TIME],DATEADD(MI,59,@CTIME)))
       FROM
               [MY_EVENTS] A
       WHERE
               A.[STARTING_TIME] >= @CTIME
   and A.[STARTING_TIME] < DATEADD(MI,60,@CTIME)
       SET @CTIME=DATEADD(HH,1,@CTIME)
END
SELECT
*
FROM
@DAYHOURS
ORDER BY
[BEGINDATE]

If we execute the script, our results look like this:

DESCRIPTION                                        BEGINDATE               IBEGINHOUR  IDURATION
-------------------------------------------------- ----------------------- ----------- -----------
james bond                                         2007-07-01 10:30:00     10          29
sally may                                          2007-07-01 11:30:00     11          29
jerry jones                                        2007-07-01 11:30:00     11          29
sally may                                          2007-07-01 12:00:00     12          59
jerry jones                                        2007-07-01 12:00:00     12          30
Mike Moxie                                         2007-07-01 12:30:00     12          29
Carl Cluff                                         2007-07-01 12:30:00     12          29
sally may                                          2007-07-01 13:00:00     13          30
Carl Cluff                                         2007-07-01 13:00:00     13          59
Mad Max                                            2007-07-01 14:00:00     14          30

The last step is to calculate the maximum hourly activity and to query which processes were taking place during that hour. This can be achieved by appending the following queries to the script above.

DECLARE @MAXHOUR int
SET @MAXHOUR=(
SELECT TOP 1
IBEGINHOUR
FROM
@DAYHOURS
GROUP BY
IBEGINHOUR
ORDER BY
SUM(IDURATION) DESC)
SELECT @MAXHOUR 'Busiest Hour'
SELECT
*
FROM
@DAYHOURS
WHERE
IBEGINHOUR=@MAXHOUR

Now when we run the script, we get the answer to the original questions. The busiest hour is 12:00 pm - 1:00 pm and there were four processes active during this period.

SELECT TOP 1
Busiest Hour
------------
12

DESCRIPTION                                        BEGINDATE               IBEGINHOUR  IDURATION
-------------------------------------------------- ----------------------- ----------- -----------
sally may                                          2007-07-01 12:00:00     12          59
jerry jones                                        2007-07-01 12:00:00     12          30
Mike Moxie                                         2007-07-01 12:30:00     12          29
Carl Cluff                                         2007-07-01 12:30:00     12          29

With a little bit of modification we could use this same script to find the times with the least activity

or to calculate the average amount of un-utilized time in a day. I find that working with process data can

be confusing and complicated at times. I hope that this article provides a good starting point for your

analysis of process data.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating