SQLServerCentral Article

Useful Dates: The Many Uses of Date Tables

,

Often times it is necessary to display or join to a range of dates. There are many ways to do this. You could use a while loop to generate the dates every time you need one. But while loops are often slow and consume valuable resources. You could use a numbers table (see util_nums article) to generate dates on the fly which will be faster, but still requires using SQL Server functions to do date calculations.

What is a dates table?

A dates table is just that, a table that contains dates. But what kind. We'll go over some of the possible uses of a dates table, but first let's talk about the structure. And just like with the util_nums table, we will create the table, populate it, then apply the indexes.

createtable [dbo].[util_time_coordinate]
     (date_id       int not null identity
     ,theDimension  varchar(7) not null
     ,theCoordinate datetime not null
     )
 go
 alter table [dbo].[util_time_coordinate] add constraint PK_util_dates primary key nonclustered (date_id)with fillfactor = 100
 create unique clustered index uidx_util_dates_dimension_theDate on [dbo].[util_time_coordinate] (theDimension,theCoordinate) with fillfactor = 100

Pretty simple huh! But let's break it down.

  • date_id: This is just a surrogate key in case you find it useful to reference this date table later for aggregation. Believe me, there are uses for it.
  • theDimension: This is the datepart. Later we will be populating this with the following values: Year, Quarter, Month, Day and Hour.
  • theCoordinate: This is where the actual time coordinate will be stored. You will be able to see sample data below.

Populating the table

Ok. Before we get into HOW to use it, we better get it populated. We will be using the util_nums table to generate this data. But we will need more than 1 million records of we want to create a large date coordinate span. For our example, we are going to create all coordinates for each of the dimensions between January 1, 1900 to December 31, 2099. Seems like overkill right? Perhaps. You will need to determine what your @BaseDate and @StopDate values should be. As for storage, this one is going to be a little more costly than the util_nums table. It requires approximately 500KB per year, for every year you wish to create in this table. So the 200 years we will be creating will require 100MB (including index and data).

The below code will generate the following record counts per dimension.

 theDimension          Count
 ------------          -----------
 HOUR                  1753176
 DAY                   73049
 MONTH                 2400
 QUARTER               800
 YEAR                  200

Different population options were explored for the util_nums table. But there really is no point here since we this entire process takes only 45 seconds *1.

 set nocount on
 declare    @BaseDate    datetime
         ,@StopDate     datetime
 set@BaseDate = converT(datetime,'1900.01.01',102)
 set@StopDate = converT(datetime,'2100.01.01',102)
 declare    @dateParts TABLE
     (Id            int
     ,theDatePart   varchar(7)
     )
 insert into @DateParts values(1,'YEAR')
 insert into @DateParts values(2,'QUARTER')
 insert into @DateParts values(3,'MONTH')
 insert into @DateParts values(4,'DAY')
 insert into @DateParts values(5,'HOUR')
 declare    @ID          int
         ,@theDatePart  varchar(7)
         ,@parmlist     nvarchar(1000)
         ,@SQL          nvarchar(1000)
 select@ID = min(ID)
 from@DateParts
 while@ID is not null
 begin
     select @theDatePart = theDatePart from @dateparts where id = @id
 set@parmlist =
 N'@basedate    datetime
 , @StopDate     datetime
 , @thedatepart varchar(7)'
     set @SQL = N'select  @theDatePart, @baseDate
     union all
     select @theDatePart, dateadd(' + @theDatePart +', n, @baseDate)
     from [dbo].[util_nums]
     where n < datediff(' + @theDatePart + ', @baseDate,@StopDate) '
         insert into [dbo].[util_time_coordinate] (theDimension,theCoordinate)
         exec sp_executeSQL @sql, @parmlist, @basedate=@basedate, @theDatePart=@theDatePart, @StopDate = @StopDate
     select top 1 @id = id from @dateparts where id > @id
     if @@rowcount=0
         break
 end
 GO

Now let's go over some of the possible uses.

Time tracking:

If you have an employee time table, how would you find a list of dates the employee did NOT work? You can loop through, distinct date by date, OR:

create table #employees
     (employee_id   int
     ,employee_name varchar(32)
     )
go
insert into #employees values(1,'Brandon Galderisi')
insert into #employees values(2,'Donald Duck')
insert into #employees values(3,'Mickey Mouse')
go
create table #EmployeeTime
     (employee_id   int
     ,work_date     datetime
     ,StartDatetime datetime
     ,endDateTime   datetime
     )
GO
insert into #EmployeeTime select 1,convert(datetime,convert(char(10),getdate(),101),101),getdate(),dateadd(hh,8,getdate())
insert into #EmployeeTime select 1,convert(datetime,convert(char(10),getdate()-1,101),101),getdate()-1,dateadd(hh,8,getdate()-1)
insert into #EmployeeTime select 1,convert(datetime,convert(char(10),getdate()-3,101),101),getdate()-3,dateadd(hh,8,getdate()-3)
insert into #EmployeeTime select 2,convert(datetime,convert(char(10),getdate(),101),101),getdate(),dateadd(hh,8,getdate())
insert into #EmployeeTime select 2,convert(datetime,convert(char(10),getdate()-1,101),101),getdate()-1,dateadd(hh,8,getdate()-1)
insert into #EmployeeTime select 3,convert(datetime,convert(char(10),getdate()-3,101),101),getdate()-3,dateadd(hh,8,getdate()-3)
GO
;with Employee_Days as
(select e.employee_id,t.thecoordinate from #employees e
cross join [dbo].[util_time_coordinate] t
where t.theDimension = 'DAY'
  and theCoordinate >= getdate()-14
  and theCoordinate <= getdate())
select e.*
     ,convert(char(5),dateadd(n,isnull(datediff(n, startdatetime,enddatetime),0),0),108) Time_Worked
fromEmployee_Days e
 left outerjoin #EmployeeTime t
   on e.employee_id = t.employee_id
   and e.theCoordinate = t.work_date
order by e.employee_id,e.thecoordinate
GO
drop table #EmployeeTime
drop table #employees

Let's break it down!

The first 4 sections are simple. We are creating some sample tables and populating some data.

Next: We use a cross join to map all employees to all dates for the range we want to check for. Notice the use of the t.theDimension. Since certain coordinates will have multiple dimensions, we need to specify what dimension we want. I've represented it as a common table expression for making it's reference simpler in the following section.

;with Employee_Days as
(select e.employee_id,t.thecoordinate from #employees e
cross join [dbo].[util_time_coordinate] t
where t.theDimension = 'DAY'
  and theCoordinate >= getdate()-14
  and theCoordinate <= getdate())

As I just explained, the Employee_Days is a mapping of all employees to all days for the time specified. This will ensure that we can represent days where data does not exist.

select e.*
     ,convert(char(5),dateadd(n,isnull(datediff(n, startdatetime,enddatetime),0),0),108) Time_Worked
fromEmployee_Days e
 left outerjoin #EmployeeTime t
   on e.employee_id = t.employee_id
   and e.theCoordinate = t.work_date
order by e.employee_id,e.thecoordinate

The output:

employee_id    thecoordinate                         Time_Worked
-----------    -----------------------               -----------
1              2008-09-25 00:00:00.000               00:00
..edited
1              2008-10-04 00:00:00.000               00:00
1              2008-10-05 00:00:00.000               08:00
1              2008-10-06 00:00:00.000               00:00
1              2008-10-07 00:00:00.000               08:00
1              2008-10-08 00:00:00.000               08:00
2              2008-09-25 00:00:00.000               00:00
..edited
2              2008-10-06 00:00:00.000               00:00
2              2008-10-07 00:00:00.000               08:00
2              2008-10-08 00:00:00.000               08:00
3              2008-09-25 00:00:00.000               00:00
3              2008-09-26 00:00:00.000               00:00
..edited
3              2008-10-05 00:00:00.000               08:00
3              2008-10-06 00:00:00.000               00:00
3              2008-10-07 00:00:00.000               00:00
3              2008-10-08 00:00:00.000               00:00

I removed the output, but the sample set included 42 records. Which is one for each employee for each of the 14 days requested.

Time Scheduling:

The same principal to time tracking can be applied to time scheduling. You can count the number of people scheduled to work on a given day, and represent those days that have 0 people scheduled.

What's next:

The next topic I will cover is string parsing and manipulation.

If you would like any additional information about this or to request a future topic, please feel free to email me at Brandon@Galderisi.com.

*1 – All times may vary.

Test Machine:

XEON 3060
4GB Ram (500MB allocted to SQL)
SQL Server 2005 Express Edition

Rate

2.35 (26)

You rated this post out of 5. Change rating

Share

Share

Rate

2.35 (26)

You rated this post out of 5. Change rating