RE: The Numbers Table – SQLServerCentral

SSC Guru

Points: 442467

November 27, 2008 at 9:58 am

jacroberts (11/27/2008)
timothyawiseman (11/27/2008)
jacroberts (11/25/2008)
timothyawiseman (11/24/2008)[hr
>>The problem is that your example did not show a case where efficiency is not a concern.
>>But, in your example you say there is no need to worry about the efficiency of a query that runs with no user waiting, but I disagree.
If you had read to the end of the sentence I wrote you would have seen that efficiency is not a concern in this example. So I repeat:
If you have a daily report that is run in batch, with no user waiting at the end of a button for the result, it doesn't matter if the SQL takes 3.2 seconds instead of 3.1 seconds to run.
Yes, I know what you said, but I respectfully disagree.
Once again, at that point in time, the developer-time required to optimize it may simply be a higher price than it is worth, but it will always matter how long it takes to run.
When I was newer to programming, I thought there were times when it would never matter, so I often didn't worry about it. Then the datasets would grow and small inefficiencies would become magnified until they mattered a great deal.
And just because its normally a nightly report that runs when no one is around, as the company grows you may find that time when no one is around begins to shrink quickly. Also, there may be times when management decides they need a refresh with the latest data in the middle of the day.
I have written procedures meant to run in the middle of the night only, but then later had someone rushing up saying they needed it produced now (in my case, it was mostly copying data from production to a reporting server).
And even if you genuinely know for whatever reason that the length of time that the procedure takes to run will actually never matter, remember that humans are creatures of habit. If a programmer is in the habit of writing efficient code, they will do it even when they are not trying to, but if they are not they may find it a very hard thing to do even when they have to.
There are times when other concerns, such as developer-time or robustness with more validation and error correction, outweigh and override the drive for efficiency, but I believe that it is always a concern.
Ok, I have recently written some code that uses an RBAR TVF to increase the efficiency and speed of a query.
Here's my example:
We had a table of dates on the system called RefDates that reports use in their SQL. The table RefDates contained every day in a 5 year period or approximately 2,000 rows. We had another table of events that had to be reported against; approximately 1 event was generated every minute. The report was generated for a period of 1 month (31 days max). The query was very slow as there were a lot of rows in the RefDates lookup table.
A typical report had the following line:
RIGHT JOIN RefDates C
ON B.Date = C.Date
WHERE C.Date >= @StartDateTime
AND C.Date < @EndDateTime
Instead I put a RBAR table valued function:
CREATE FUNCTION [dbo].[GenRefDates]
(
@StartDate datetime,
@EndDate datetime
)
RETURNS @table TABLE (Date datetime)
AS BEGIN
DECLARE @tmpDate datetime
SET @tmpDate = Convert(varchar, @StartDate, 112)
SET @EndDate = Convert(varchar, @EndDate, 112)
WHILE @tmpDate <= @EndDate
BEGIN
INSERT INTO @table VALUES (@tmpDate)
SET @tmpDate = DateAdd(dd, 1, @tmpDate)
END
RETURN
END
The query was then changed to:
RIGHT JOIN GenRefDates(@StartDateTime, @EndDateTime) C
ON B.Date = C.Date
This changed the run time of the query from about a minute to a few seconds. So this shows that a RBAR table valued function can be much more efficient than using a lookup ref table. BTW the time taken to generate the TVF with 31 days is a few micro seconds and is totally insignificant compared to the rest of the query.
So there is an example where an RBAR TVF is much more efficient than a lookup table.

This is true, for a small number of rows. Your RBAR TVF will become ineffecient should requirements change and the number of rows it has to generate becomes quite large. This is what we all have been talking about in this thread. Try it, see how long your RBAR TVF takes to generate 10,000 rows versus the same function written in a set-based manner.