RE: Please vote for the "built in Tally Table" function.

SSC Guru

Points: 1003894

November 28, 2014 at 5:25 pm

CELKO (11/28/2014)
I have mixed feeling on this. It sounds like a schema level table constant with contiguous exact numeric values. Nobody can change a constant. Why would it start at one and not zero? (I have an answer for that, based on sets, but I digress).
But would that be a CREATE SEQUENCE with extra options? What about a CALENDAR constant table? How do you get INCITS to agree?
Right now,I just load these two tables ever time I build a new schema.

Good questions, Joe.

A part of the problem is that if you have a fixed size table, it could be either too big or too small. Being that disk space is cheap, you might think that "too big" would be the way to go and if disk were all that you needed to worry about, then "too big" would fit the bill and it has for you, me, and thousands of DBAs and Developers for decades.

In order for such a table to be used, it actually has to live in memory. But, considering the size of the table, even that wouldn't be a problem because (for example) a million row table of integers would only require 4 million bytes plus a little for the B-Tree of the Clustered Index. That's almost nothing for today's machines.

Then, there's the problem of logical READs. A physical TALLY table can cause a huge number of reads especially if there are a lot of functions created that use the Tally table for every row via a CROSS APPLY or similar. These particular reads (although each READ is 8,192 bytes) aren't bad READs but they do tend to obfuscate bigger problems that are frequently detected by finding the queries with the largest number of reads over time. Even then, you could use Itzik's wonderful cascading CTE method which returns the desired numbers in a totally read-less fashion.

So why the request for a built-in function to do this?

One reason is, of course, simple convenience. I'm sure that you've worked in shops where it almost takes an "Act of Congress" to get permission to add such "helper" tables and functions to a database or, worse yet, a shop that insists that such "helper" tables and functions live in every database where they are needed so as to not make the database dependent on a "utility" database. Rebuilding/copying Itzik's wonder cascading CTE into every nook and cranny that needs it, if nothing else, is a pain.

But the big reason (if they do it right) is one of my favorites... blinding speed. And not just for sequences that start at 0 or 1. If they set it up to take a start and end value, you would no longer have to add offsets to the sequence to get the values that you actually need. That even lends itself well to on-the-fly creation of date sequences.

In an intangible area of performance, a new function in SQL Server would get a whole lot more attention than what the "Tally" or "Numbers" table currently has even though it's been around since the '60s because everyone wants to use the next new shinny object in SQL Server. It might even stop people from recommending the use of recursive CTEs that count and {gasp} splitters based on WHILE loops.

If Microsoft does it right (doesn't manage to kill performance like they have with so many things) and if they document it correctly, a lot of people who don't even know how to spell "Tally Table" will start to use it.

Of course, this would be the tip of the proverbial ice-berg. MS really should build a decent splitter function so people can stop killing themselves with trying to normalize junk inputs and a decent date/time sequence generator.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)