Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««1234»»

Fun with business days, calendar tables, and test-driven development Expand / Collapse
Author
Message
Posted Saturday, September 25, 2010 2:32 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, November 22, 2010 6:53 PM
Points: 7, Visits: 49
I really want to encourage people to automate these type of tests. I know that you said in the article about automating tests that “it is not a commonly accepted practice yet” and I think that may be because of the overhead involved by some of the automation frameworks. Here is a solution that is automated yet it is as simple to write as your manual tests. Once you've installed the TST database (http://tst.codeplex.com, once downloaded it takes about 20 seconds to install and it is all T-SQL) all you’d have to do for your case is to write the tests procedures below. Then you can execute the tests in command prompt by running:
"TST /RunAll DatabaseName"
or in SQL by running
"EXEC TST.Runner.RunAll DatabaseName"

CREATE PROCEDURE dbo.SQLTest_NearestBusinessDayOnOrBefore
AS BEGIN

DECLARE @ActualResult DATETIME

SELECT @ActualResult = Date FROM dbo.NearestBusinessDayOnOrBefore('20100827')
EXEC TST.Assert.Equals 'NearestBusinessDayOnOrBefore(''20100827'')', {d '2010-08-27'}, @ActualResult

SELECT @ActualResult = Date FROM dbo.NearestBusinessDayOnOrBefore('20100828')
EXEC TST.Assert.Equals 'NearestBusinessDayOnOrBefore(''20100828'')', {d '2010-08-30'}, @ActualResult

SELECT @ActualResult = Date FROM dbo.NearestBusinessDayOnOrBefore('20100829')
EXEC TST.Assert.Equals 'NearestBusinessDayOnOrBefore(''20100828'')', {d '2010-08-30'}, @ActualResult

SELECT @ActualResult = Date FROM dbo.NearestBusinessDayOnOrBefore('20100830')
EXEC TST.Assert.Equals 'NearestBusinessDayOnOrBefore(''20100828'')', {d '2010-08-30'}, @ActualResult

END
GO

CREATE PROCEDURE dbo.SQLTest_NearestBusinessDayOnOrBeforeNull
AS BEGIN

DECLARE @ActualResult DATETIME

SELECT @ActualResult = Date FROM dbo.NearestBusinessDayOnOrBefore(NULL)
EXEC TST.Assert.IsNull 'NearestBusinessDayOnOrBefore(NULL) should return NULL', @ActualResult

END
GO

CREATE PROCEDURE dbo.SQLTest_BusinessDaysBetweenEndsOfDaysOutOfRange
AS
BEGIN

DECLARE @ActualResult DATETIME

SELECT @ActualResult = NumDays FROM dbo.BusinessDaysBetweenEndsOfDays('20010827', '20100817')
EXEC TST.Assert.IsNull 'BusinessDaysBetweenEndsOfDays(''20010827'', ''20100817'') should return NULL', @ActualResult

SELECT @ActualResult = NumDays FROM dbo.BusinessDaysBetweenEndsOfDays('20100827', '20990817')
EXEC TST.Assert.IsNull 'BusinessDaysBetweenEndsOfDays(''20100827'', ''20990817'') should return NULL', @ActualResult

SELECT @ActualResult = NumDays FROM dbo.BusinessDaysBetweenEndsOfDays('20010827', '20990817')
EXEC TST.Assert.IsNull 'BusinessDaysBetweenEndsOfDays(''20010827'', ''20990817'') should return NULL', @ActualResult

END
GO

Post #993217
Posted Saturday, September 25, 2010 3:08 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Saturday, May 24, 2014 7:58 PM
Points: 219, Visits: 824
Ladi,

This is a interesting suggestion, a response I have been waiting for. I know about your framework, I read your article and commented on it. What you are suggesting is certainly worth doing. What we are doing in C# is also a working approach.

Yet I have another suggestion. Once we are done with manual runs, once we are completely satisfied with the results, we can just use a simple tool to save expected results in Json or XML. Because we frequently have multiple result sets, we better do it in C# - T-SQL is incapable of capturing multiple result sets. Generating expected results is much faster than having to write unit tests.

Also when expected results are decoupled from the test, it simplifies maintenance.

What do you think?
Post #993299
Posted Sunday, September 26, 2010 9:10 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, July 27, 2014 4:54 AM
Points: 910, Visits: 840
Thank you Alvin, your remark together with the link from wildt about
Populating a Kimball Date Dimension
are what I am looking for.
Post #993371
Posted Monday, September 27, 2010 2:34 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 2:44 PM
Points: 1,945, Visits: 2,862
Alexander Kuznetsov (9/24/2010)
CELKO (9/23/2010)
I still prefer Julianizing the business days to make the math easier

CREATE TABLE Calendar
(cal_date DATE NOT NUL PRIMARY KEY,
julian_business_nbr INTEGER NOT NULL,
...);
INSERT INTO Calendar
VALUES ('2007-04-05', 42),
('2007-04-06', 43), -- good friday
('2007-04-07', 43),
('2007-04-08', 43), -- Easter sunday
('2007-04-09', 44),
('2007-04-10', 45); --Tuesday

To compute the business days from Thursday of this week to next
Tuesday:

SELECT (C2. julian_business_nbr - C1. julian_business_nbr)
FROM Calendar AS C1, Calendar AS C2
WHERE C1.cal_date = '2007-04-05',
AND C2cal_date = '2007-04-10';


Yes, we do it too. When a new holiday comes up, like President Ford's funeral, we update all those future numbers.


The updating is not much of a problem:

1) New holidays do not occur very often, per one or two per decade at most. But queries about business days occur very frequently. Think about expected delivery dates which are quoted in business days. How many times a day do you think Amazon or Netflix uses an expected delivery date computation or look-up? So design the schema for the most common case.

2) The julianized date column will not be indexed, so it gets updated in place without any re-indexing overhead. If you want to save space, use a SMALLINT (32,767 = 89 years). The DATE data type is only 3 bytes now. Even if I have other columns, I should have short rows and a low percentage of free space on the data pages.

3) By indexing on the calendar date in descending order, we can keep the current time frame in cache or main storage, so the julianized date math will be done without much disk access. The math is straight integer subtraction, far faster than calling a temporal procedure.


Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #994027
Posted Wednesday, September 29, 2010 1:16 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, March 12, 2014 8:57 PM
Points: 3, Visits: 38
Thanks for a simple yet effective tool.
One thing was not clear = why all those functions are declared to return TABLE? it looks to me these funcs return just a date. What am I missing?
Post #994958
Posted Sunday, October 3, 2010 12:52 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Saturday, May 24, 2014 7:58 PM
Points: 219, Visits: 824
CELKO (9/27/2010)
Alexander Kuznetsov (9/24/2010)
CELKO (9/23/2010)
I still prefer Julianizing the business days to make the math easier

CREATE TABLE Calendar
(cal_date DATE NOT NUL PRIMARY KEY,
julian_business_nbr INTEGER NOT NULL,
...);
INSERT INTO Calendar
VALUES ('2007-04-05', 42),
('2007-04-06', 43), -- good friday
('2007-04-07', 43),
('2007-04-08', 43), -- Easter sunday
('2007-04-09', 44),
('2007-04-10', 45); --Tuesday

To compute the business days from Thursday of this week to next
Tuesday:

SELECT (C2. julian_business_nbr - C1. julian_business_nbr)
FROM Calendar AS C1, Calendar AS C2
WHERE C1.cal_date = '2007-04-05',
AND C2cal_date = '2007-04-10';


Yes, we do it too. When a new holiday comes up, like President Ford's funeral, we update all those future numbers.


The updating is not much of a problem:

1) New holidays do not occur very often, per one or two per decade at most. But queries about business days occur very frequently.


Joe,

In general, I concur - storing consecutive numbers with business days makes perfect sense. Yet changes in what is a business day occur all the time, especially if we deal with multiple countries, and a business day must be such in all involved countries. Even if we limit the scope to US only, changes do occur. Just recently, Chicago Mercantile Exchange announced how they are going to operate on Columbus Day, and that seems to e a change from previous years.
Post #997313
Posted Sunday, October 3, 2010 12:54 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Saturday, May 24, 2014 7:58 PM
Points: 219, Visits: 824
migurus (9/29/2010)
Thanks for a simple yet effective tool.
One thing was not clear = why all those functions are declared to return TABLE? it looks to me these funcs return just a date. What am I missing?


migurus,

I made sure that my functions are inline, so that they perform fast. TABLE keyword if just part of declaration syntax for inline functions.
Post #997314
Posted Tuesday, October 5, 2010 6:53 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, November 22, 2010 6:53 PM
Points: 7, Visits: 49
Alexander Kuznetsov (9/25/2010)
Ladi,

This is a interesting suggestion, a response I have been waiting for. I know about your framework, I read your article and commented on it. What you are suggesting is certainly worth doing. What we are doing in C# is also a working approach.

Yet I have another suggestion. Once we are done with manual runs, once we are completely satisfied with the results, we can just use a simple tool to save expected results in Json or XML. Because we frequently have multiple result sets, we better do it in C# - T-SQL is incapable of capturing multiple result sets. Generating expected results is much faster than having to write unit tests.

Also when expected results are decoupled from the test, it simplifies maintenance.

What do you think?


What you are suggesting is a practical approach. However it goes a little bit against an important principle in unit testing – that you should calculate the expected result yourself and not trust the application to provide a baseline for you. Also in case of TDD that is not really appropriate (or possible).

I could see some scenarios that involve integration testing rather than unit testing where your approach would be OK. For example when you’d want to collect large results and consider them as a baseline to protect against regressions or changes in behavior before a refactoring.

Bringing “better languages” in the test automation has some advantages. SQL is error prone compared with C# so if you have a solid infrastructure written in C# that could allow unit testing of SQL code that may be a better solution than T.S.T.

T.S.T. is addressed to developers or teams who are familiar with T-SQL but do not have the expertise or the environment needed to support other unit test tool that could exercise SQL Code. For these situations T.S.T. is ideal. Your article were written as if for such an environment (the test code was also in SQL) so that is why I thought T.S.T. was an approach worth mentioning.
Post #998889
Posted Tuesday, October 5, 2010 7:17 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Saturday, May 24, 2014 7:58 PM
Points: 219, Visits: 824
Ladi,

I think that when we are dealing with T-SQL and the expected results are result sets consisting of rows and columns, with column names and column types, following the principle "that you should calculate the expected result yourself" uses up a huge amount of time. It is much faster to let the tool generate the results for you, if you have a tool you can trust. I do have such a tool, which saves me a huge amount of time. What do you think?
Post #998892
Posted Tuesday, October 5, 2010 10:19 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, November 22, 2010 6:53 PM
Points: 7, Visits: 49
It depends. I find that this is a lot easier to discuss on specific examples and I’d like to analyze a couple.

Let’s take the code that you present in your article. There you have a couple of functions “NearestBusinessDayOnOrBefore” and “BusinessDaysBetweenEndsOfDays”. I had plenty of cases like that in production code where we had to validate UDFs. In that case you should calculate the results yourself and T.S.T. could be an ideal tool to use.

If you are talking about a test scenario where you expect a stored procedure to return hundreds or thousands of rows, then you are most likely talk about an integration test and not a unit test. Let’s say that you have a stored procedure that will return a dataset where every row contains information about the cost accrued per day in the context of some financial dealings. Maybe the issue is complicated by the fact that there are a lot of parameters like: discounts that fluctuate with time, service interruptions, variable level of service provided, holidays, client status, legal aspects and so on. If you will feed to the system scenarios that cover a long period of time and include variations of a lot of these parameters you may get a result that will consist of hundreds and thousands of rows where the results are impossible to calculate manually. In this case TST can still be useful especially if you store the baselines in some tables or if the baseline can be calculated by an alternate set of procedures. However, depending of the test infrastructure and the tools you have it is possible you will want to use something different. Like I said, this type of test is not a unit test – it is an integration test. Traditionally there is a lot of confusion between the two and in my experience a lot of people build integration tests and call them unit tests. Ignoring for a moment the terminology issue, such a way of doing things is a valid one and automating it has value but also has certain implication (most of them around maintenance) that have to be understood. I’ll get back to this in a minute. A unit test will be different. You’ll feed to the system a set of scenarios where only one parameter varies and where the result consists of a very small data set typically 1-10 rows. It is a little bit more to be said about how the code and the unit tests can be written in a way that will assure code coverage in scenarios where multiple parameter vary simultaneously. These types of tests will be appropriate for TST and the expected results will be calculated manually.

Getting back to the case where you do an integration test- that can have great advantages but also has certain downsides.
Advantages:
- You can cover a large number of cases that include complicated combinations of the scenarios parameters. Creating and freezing a baseline can be done very productively in this way.

Downsides:
- This approach suffers from a test matrix explosion. I have seen cases where this was a significant issue. The result was that not all test cases were covered. Also when calculating the expected results there is a real risk to accept erogenous results. Calculating expected results based on some alternate automation is not trivial. Unit tests will not suffer from the matrix explosion issue.
- When a calculation rule changes it can be sometimes difficult to recalculate the expected result. Unit tests are great in this case if that were written correctly in the sense that they have a minimum of overlapping.

I think that both integration tests and unit tests have value but it is important to understand the difference ant their cost implication especially long term related to maintenance in face of changing requirements.

In conclusion I think that T.S.T can give you a very fast start – asks only for SQL expertise and has no other dependencies. It is also very easy to install and very easy to learn for someone who knows SQL. For large systems it may make sense to be combined or even replaced with heavier and more feature rich frameworks.
Post #998930
« Prev Topic | Next Topic »

Add to briefcase «««1234»»

Permissions Expand / Collapse