# Fun with business days, calendar tables, and test-driven development

By Alex Kuznetsov,

When we need to find the nearest business day, or to calculate number of business days between two dates, and when we need a simple solution, then UDFs using calendar tables fit the bill perfectly. In this article we shall discuss how to:

• Build a calendar table
• Find the nearest business day
• Calculate number of business days between two dates

To ensure good performance and code reuse, our solutions will be wrapped as inline UDFs. Also we shall see a little bit of test-driven development in action, as we shall be developing these solutions in little increments.

### Building a calendar table

In this section we shall create and populate a calendar table that stores all days, both business and non-business ones (the reasons why we need to store all days rather than only non-business ones will be discussed later). To populate such a table in a set-based way, in the way we typically do it in real life, we need a numbers table - a table with consecutive numbers. In case you don't have it yet, let us create a numbers table from scratch:

```CREATE TABLE dbo.Numbers ( n INT NOT NULL CONSTRAINT PK_Numbers PRIMARY KEY )GOINSERT INTO dbo.Numbers ( n ) VALUES ( 1 ) ;DECLARE @i INT ; SET @i = 1;WHILE @i < 1024 BEGIN INSERT INTO dbo.Numbers ( n ) SELECT n + @i FROM dbo.Numbers ; SET @i = @i * 2 ; END ;
```

Once we have a numbers table, we can go ahead and build a calendar one, as follows:

```CREATE TABLE dbo.Calendar ( [Date] DATE NOT NULL CONSTRAINT PK_Calendar PRIMARY KEY , IsBusinessDate CHAR(1) NOT NULL CONSTRAINT CHK_Calendar_IsBusinessDate CHECK ( IsBusinessDate IN ( 'Y', 'N' ) )  ) ;GO
INSERT INTO dbo.Calendar ( Date , IsBusinessDate  ) SELECT DATEADD(DAY, n, '20091231') AS [Date] , CASE WHEN DATEPART(weekday, DATEADD(DAY, n, '20091231')) IN ( 1, 7 ) THEN 'N' ELSE 'Y' END AS IsBusinessDate FROM dbo.Numbers
```

Also we need to mark holidays such as Easter and Independence Day in our calendar, as follows:

```UPDATE dbo.CalendarSET IsBusinessDate = 'N'WHERE [Date] IN ( '20100404', -- Easter '20100705' ) -- day after Independence

```

Now that the calendar table is built, let us utilize it.

### Finding the business day on or after the given one

To make sure our code is easy to reuse, we shall create a user defined function. To make sure the function performs well, we shall create an inline one. For a detailed explanation of performace differences between scalar and inline UDFs, refer to Chapter 5 of my book.
Before writing any code, let us come up with a few tests:

```SELECT '20100827' AS Expected , [Date]FROM dbo.NearestBusinessDayOnOrBefore('20100827')
SELECT '20100830' AS Expected , [Date]FROM dbo.NearestBusinessDayOnOrBefore('20100828')
SELECT '20100830' AS Expected , [Date]FROM dbo.NearestBusinessDayOnOrBefore('20100829')
SELECT '20100830' AS Expected , [Date]FROM dbo.NearestBusinessDayOnOrBefore('20100830')

```

Of course, these tests are manual, not automated. We shall not discuss automatd and unit tests here. Although some of us already use automated tests, it is not a commonly accepted practice yet. Also at the time of this writing different teams use very different approaches for database testing, and I don't know of any widely accepted and convenient one. Discussing which approach to use for automated tests, and which tests to incorporate in the test harness, may be very interesting, but it is beyond the scope of this article.

Once we have the tests which describe the interface and expected results, let us develop just enough code to pass those tests:

```CREATE FUNCTION dbo.NearestBusinessDayOnOrBefore ( @Date DATE )RETURNS TABLEAS RETURN ( SELECT TOP 1 [Date] FROM dbo.Calendar WHERE [Date] <= @Date AND IsBusinessDate = 'Y' ORDER BY [Date] DESC )
```

As soon as we have the function, we can run the tests against it. With all the tests passing, we could just move on to another task, but, as good defensive programmers, we need to think of cases of unintended use. We don't need to think too hard to come up with such a case. Note that the earliest date in our calendar table is Jan 1st, 2010, and the last one is Oct 20th, 2012. This means that we don't really know what are the non-business days after that date. If we call our function for any date later than Oct 20th, 2012, what result should it return?
Suppose that we have clarified the requirements, and the expected result is NULL. The following test documents this requirement; and it fails.

```DECLARE @d DATE;SELECT @d = DATEADD(DAY,1,MAX([Date])) FROM dbo.Calendar ;SELECT NULL AS Expected , [Date]FROM dbo.NearestBusinessDayOnOrBefore(@d)
```

Let us add some more code, so that this new test passes as well as the old ones:

```ALTER FUNCTION dbo.NearestBusinessDayOnOrBefore ( @Date DATE )RETURNS TABLEAS RETURN ( SELECT TOP 1 CASE WHEN EXISTS ( SELECT * FROM dbo.Calendar WHERE [Date] = @Date ) THEN [Date] END AS [Date] FROM dbo.Calendar WHERE [Date] <= @Date AND IsBusinessDate = 'Y' ORDER BY [Date] DESC )
```

The example that we have just considered demonstrates why we have chosen to store all days in our table, rather than store only holidays, as is frequnelty suggested. The reason is simple: if store only holidays, we cannot distinguish between the situations when we know that there are no holidays and the situations when we don't know what the holidays are.

Note: in real life we should consider incorporating at least some of the tests used during this session of test-driven development in our automated test harness.

Also in real life we may decide that we shall build our calendar table for a really wide range of dates, and we don't care what is returned if the argument it outside that wide range. Under that assumption we may choose to use the previous, the simpler and more performant, version of our function, the one without EXISTS clause. This is what I actually use in my system.

### Finding number of business days between two days.

As in the previous section, we shall use test-driven development to come up with a solution. This means that some tests must be developed first, as follows:

```-- 0 business days between end of Friday and end of next SundaySELECT 0 AS Expected , NumDaysFROM dbo.BusinessDaysBetweenEndsOfDays ('20100827', '20100829')
-- 5 business days between two consecutive Wednesdays, without holidays between themSELECT 5 AS Expected , NumDaysFROM dbo.BusinessDaysBetweenEndsOfDays ('20100818', '20100825')
-- 0 business days if both parameters are the sameSELECT 0 AS Expected , NumDaysFROM dbo.BusinessDaysBetweenEndsOfDays ('20100827', '20100827')
```

Note: the "between ends of days" requirement needs clarification. How many business days are between Friday and the next Saturday? It depends - there is one business day between early Friday morning and early Saturday morning, but there are zero business days between late Friday night and late Saturday night. The "between ends of days" requirement means that we are implementing the second approach, the number of business days between the times late at night on given days.

Let us implement just enough code to meet these requirements, to pass these tests:

```CREATE FUNCTION dbo.BusinessDaysBetweenEndsOfDays ( @DateFrom DATE , @DateTo DATE  )RETURNS TABLEAS RETURN ( SELECT COUNT(*) AS NumDays FROM dbo.Calendar WHERE [Date] > @DateFrom AND [Date] <= @DateTo AND IsBusinessDate = 'Y' )
```

This code passes our tests, but we have more requirements. As in the previous section, we may need to address the case when the calendar table does not store all the days between DateFrom and DateTo. Of course, we might choose to not handle this case, just like we did in the very end of the previous section. However, let us suppose that we have decided to address this case.

Before adding more tests, and before altering the function to pass them, let me explain why we don't lay out all the tests at once, and why we don't develop all the code at once. In my experience, we are more efficient when we develop code in reasonalbly small increments. If we are trying to implement all the requirements at once, we usually spend more time on development. This is especially true when we have a lot of distractions and need to multitask - in such cases we want to complete a step between distractions and context switching.

Of course, for developers with different skill levels and different experience with the subject area the optimal step size may be very different. So, if these steps are too small for you, just take bigger ones.

Getting back to development, the following tests all expect NULL, because one or both ends of the range are beyond the range of known dates, and these tests do not pass yet:

```-- NULL if at least one parameter does not have a row in Calendar tableSELECT NULL AS Expected , NumDaysFROM dbo.BusinessDaysBetweenEndsOfDays('20010827', '20100817')
SELECT NULL AS Expected , NumDaysFROM dbo.BusinessDaysBetweenEndsOfDays('20100827', '20990817')
SELECT NULL AS Expected , NumDaysFROM dbo.BusinessDaysBetweenEndsOfDays('20010827', '20990817')
```

Modifying the function to pass these tests is easy:

```ALTER FUNCTION dbo.BusinessDaysBetweenEndsOfDays ( @DateFrom DATE , @DateTo DATE  )RETURNS TABLEAS RETURN ( SELECT CASE WHEN EXISTS ( SELECT * FROM dbo.Calendar WHERE [Date] = @DateFrom ) AND EXISTS ( SELECT * FROM dbo.Calendar WHERE [Date] = @DateTo ) THEN COUNT(*) END AS NumDays FROM dbo.Calendar WHERE [Date] > @DateFrom AND [Date] <= @DateTo AND IsBusinessDate = 'Y' )
```

So far so good, but what shall we do if DateTo is earlier than DateFrom? The current version of the function returns either null or a positive number. If the requirements are to return negative number in such cases, we would need one more iteration of adding one more test and altering the function to pass it.

### What can go wrong with our solution?

As careful and experienced developers, we must not rush to deploy as soon as all our tests have passed. We need to think of cases of unintended use; we need to spend some time trying to break our own solutions. In this case, breaking our solution is very easy. All our code assumes that our calendar table stores dates without gaps. If someone occasionally deletes rows from the middle of the range of the calendar table, we shall start getting incorrect results.

Can we make sure that our calendar table has no gaps? Definitely: a couple of computed columns and constraint should take care of that. This is doable, not too complex, and we shall discuss it in some later article.

Do we really have to do that? It depends. If we are developing a solution that is going to last, then in the long run it is usually easier to provide a foolproof solution from the very beginning than to go through an expensive cycle of troubleshooting, fixing, testing, and redeploying.

Total article views: 7184 | Views in the last 30 days: 1

Related Articles
FORUM

### select between left & right...

select between words?

FORUM

### Interesting SELECT

Select between two smalldatetime fields

ARTICLE

### Calendar Tables

Learn how to avoid complex date calculations and increase performance using calendar tables in this ...

FORUM

Calendar

ARTICLE

### Holidays and Calendar Generation

Create the definition of your holidays and have them available for any year. Create a Calendar incl...

Tags
 defensive programming test-driven development t-sql udf