The "Numbers" or "Tally" Table: What it is and how it replaces a loop

  • Freakin' awesome article, man. I had no idea such things as these existed. I have seen the light. Thank you!

  • Nice algorithm, jcrawf02! I see your point of confusion; let me see if I could shed some light:

    As you know, SELECT #holdMe.value FROM #holdMe returns 47 rows (one character in each row), i.e. all the records in the table.

    You are simply modifiying that slightly when you write SELECT @rebuildMe = @rebuildMe + #holdMe.value FROM #holdMe. This stores the value into the specified variable instead of adding it to a result set. And it does this 47 times, one for each row. Since your assignment is accumulative upon itself (equivalent to x += char), it collects all the characters.

  • Obscenotron (5/22/2008)


    Freakin' awesome article, man. I had no idea such things as these existed. I have seen the light. Thank you!

    Now THAT's an awesome compliment! Thanks!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jcrawf02 (5/22/2008)


    All right Jeff, I was having fun with your tally table example and trying to think set-based on how to clean up a column that included non-alpha characters. I figured it out, but now my pea-brain is stumped as to how the reconstruction of the string is occurring. How does SQL know to piece this back together in its entirety in one SELECT, without having to go through a WHILE loop OR join to a Tally Table? (I left in what I thought I was going to have to do, commented out, which I figured out I didn't need, but am not sure why)

    Heh... First off, I never build a Tally table using a While loop except for demonstrating how slow it can be. If you only do it once, it might be OK... I just never do it that way. I always do it like the following because the code is a lot shorter to type and runs faster to boot...

    --===== Create and populate the Tally table on the fly

    SELECT TOP 11000 --equates to more than 30 years of dates

    IDENTITY(INT,1,1) AS N

    INTO dbo.Tally

    FROM Master.dbo.SysColumns sc1,

    Master.dbo.SysColumns sc2

    --===== Add a Primary Key to maximize performance

    ALTER TABLE dbo.Tally

    ADD CONSTRAINT PK_Tally_N

    PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100

    --===== Allow the general public to use it

    GRANT SELECT ON dbo.Tally TO PUBLIC

    Other than that, Michael's explanation is spot on... do you have other questions on this?

    Also, no need to split into a #HoldMe table to rebuild at all...

    DECLARE @RebuildMe VARCHAR(8000)

    SELECT @RebuildMe = COALESCE(@RebuildMe,'')+SUBSTRING(mh.TheValue,t.N,1)

    FROM #MyHead mh

    CROSS JOIN #Tally t

    WHERE N <= DATALENGTH(mh.TheValue) --Catch trailing spaces

    AND SUBSTRING(mh.TheValue,t.N,1) NOT LIKE '[^A-Z ]' --Reject everything except letters and spaces

    ORDER BY mh.PK,t.N

    SELECT @RebuildMe

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jcrawf02 (5/22/2008)


    (...) How does SQL know to piece this back together in its entirety in one SELECT, without having to go through a WHILE loop OR join to a Tally Table? (...)

    (...)

    SELECT @rebuildMe = @rebuildMe + #holdMe.value

    FROM #holdMe

    (...)

    In addition to the explanations offered by Michael and Jeff, let me add that this technique is undocumented and not guaranteed. That is, it might work, but it might just as well fail - and that failure might just happen some day after deployment, for instance because the actual workload on the machine triggers the engine to use a different execution plan!

    Usually, this technique works flawlessly - but take care when you start to rely on it. I have once seen a post somewhere (I think Google's usenet archive) with some code that reproducibly caused this technique to fail. I lost the URL however, and was unable to find it again in the time I was willing to spend on searching.

    Addendum 1: A relational purist would even consider the (usual) result of this technique to be a bug. After all, relational databases are supposed to execute queries such that the results are the same as when all rows were really processed at the exact same moment.

    Addendum 2: Up until SQL Server 2000, there was no supported and/or documented method to concatenate values from different rows in a single set-based statement. Since SQL Server 2005, you can use a technique with the "FOR XML PATH" option. I don't have the exact details at hand, but I'm sure Google has.

    Note that this option is actually intended to be used for forming XML strings, and as such will cause funny things to happen on strings containing characters with special meaning in XML: the ampersand, and the greater than and less than signs.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • Jeff Moden (5/22/2008Heh... First off, I never build a Tally table using a While loop except for demonstrating how slow it can be. If you only do it once, it might be OK... I just never do it that way. I always do it like the following because the code is a lot shorter to type and runs faster to boot...

    Yeah, I know, but I have limited access to the system, and can't access the system tables at all, or create indexes. Stuck with the while loop for now. I'm considering just asking the DBA to build a tally table into the db for all to use.

    Other than that, Michael's explanation is spot on... do you have other questions on this?

    No, I think that helps me to understand it better, thinking of it as +=

    Also, no need to split into a #HoldMe table to rebuild at all...

    DECLARE @RebuildMe VARCHAR(8000)

    SELECT @RebuildMe = COALESCE(@RebuildMe,'')+SUBSTRING(mh.TheValue,t.N,1)

    FROM #MyHead mh

    CROSS JOIN #Tally t

    WHERE N <= DATALENGTH(mh.TheValue) --Catch trailing spaces

    AND SUBSTRING(mh.TheValue,t.N,1) NOT LIKE '[^A-Z ]' --Reject everything except letters and spaces

    ORDER BY mh.PK,t.N

    SELECT @RebuildMe

    Very nice! Thanks!

    ---------------------------------------------------------
    How best to post your question[/url]
    How to post performance problems[/url]
    Tally Table:What it is and how it replaces a loop[/url]

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • Hugo Kornelis (5/23/2008)

    Addendum 1: A relational purist would even consider the (usual) result of this technique to be a bug. After all, relational databases are supposed to execute queries such that the results are the same as when all rows were really processed at the exact same moment.

    I think that's where my head was, thinking that I should have to move through the set one at a time in order to add it to the preceding result.

    ---------------------------------------------------------
    How best to post your question[/url]
    How to post performance problems[/url]
    Tally Table:What it is and how it replaces a loop[/url]

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • David Jackson (5/22/2008)


    Using your test data, what would you expect this to return?

    DECLARE @rebuildMe varchar(2000), @iteration int

    SET @rebuildMe = ''

    SELECT @rebuildMe = @rebuildMe + cast(n as varchar(10)) + ' '

    FROM #tally

    SELECT @rebuildMe

    Dave J

    yesterday, I would have expected that to return 11000 result sets, each with n+' ' tacked onto the end of the last. Today, I would assume it would return n+' '+(n+1)+' ' . . . etc up to 2000 characters.

    ---------------------------------------------------------
    How best to post your question[/url]
    How to post performance problems[/url]
    Tally Table:What it is and how it replaces a loop[/url]

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • In case you take Hugo's post to heart...

    Yep... I agree... it ain't ANSI. Oracle (and several other RDBMS's) don't have this ability. Lot's of stuff in this world will break if they ever remove that ability to "overlay" a variable like that from SQL Server. Thanks goodness for the non-ANSI extensions of SQL Server. I'm pretty sure this method works in 2008, as well. 😀

    Just to make everyone feel better... the only time I've seen a failure, such as what Hugo briefly describes, is when someone forgets to put a Clustered Index on the Tally table after it's built or when parallism takes over. Even then, an Order By and a OPTION (MAXDOP1) will take care of that (although the optimizer will basically ignore the Order By if the Clustered Index is present).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (5/23/2008)


    ... or when parallism takes over. Even then, an Order By and a OPTION (MAXDOP1) will take care of that....

    :ermm:

    Just a little reminder of a nice article with the great test concerning this technique.

    It has the to the point title "Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5"

    The full article can be found at http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

    The reminder I wanted to point to is this one:

    http://www.sqlservercentral.com/Forums/FindPost450363.aspx

    The caveot is partitioning ! (even the maxdop 1 will not save you from that).

    But except for that, ..... pure lightning :w00t:

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • jcrawf02 (5/23/2008)


    Yeah, I know, but I have limited access to the system, and can't access the system tables at all, or create indexes. Stuck with the while loop for now. I'm considering just asking the DBA to build a tally table into the db for all to use.

    Howdy, thanks again for the feedback.

    Of course, asking the DBA to build a Tally Table would be the best bet. But, you can cheat a bit. The very high speed method of creating a Tally Table using a Cross Join only relies on Cross Joining to a table that's guaranteed to be large enough to produce 11,000 rows in the Cross Join. The table only needs to be SQRT(11000)+1 rows long (SQRT(11000)+1 = 105... not a real big table) so just about any table will work especially one with a clustered index on it. You just have to be guaranteed it holds at least 105 rows to get to 11000 for the tally table.

    Let's see if you have access to some of the local system objects... what do you get when you run the following with no reference to the Master database?

    [font="Courier New"]SELECT COUNT(*) FROM dbo.SysColumns --2000 and 2005[/font]

    ... or ...

    [font="Courier New"]SELECT COUNT(*) FROM sys.Columns --2005 only[/font]

    ... or ...

    [font="Courier New"]SELECT COUNT(*) FROM sys.SysColumns --2005 only[/font]

    Also, which version of SQL Server are you using? There's at least one more "trick" to building a Tally Table on the fly without the use of any table or While loop. They produce slightly slow results than the Cross-Join, but they all beat the tar out of a While loop. Lemme know...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Let's see if you have access to some of the local system objects... what do you get when you run the following with no reference to the Master database?

    [font="Courier New"]SELECT COUNT(*) FROM dbo.SysColumns --2000 and 2005[/font]

    Also, which version of SQL Server are you using? There's at least one more "trick" to building a Tally Table on the fly without the use of any table or While loop. They produce slightly slow results than the Cross-Join, but they all beat the tar out of a While loop. Lemme know...

    Thanks, I should learn that lesson about assuming again. I can access dbo.SysColumns, what I can't get to is INFORMATION_SCHEMA, or run sp_help

    And I'm on 2000.

    ---------------------------------------------------------
    How best to post your question[/url]
    How to post performance problems[/url]
    Tally Table:What it is and how it replaces a loop[/url]

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • ALZDBA (5/23/2008)


    Jeff Moden (5/23/2008)


    ... or when parallism takes over. Even then, an Order By and a OPTION (MAXDOP1) will take care of that....

    :ermm:

    Just a little reminder of a nice article with the great test concerning this technique.

    It has the to the point title "Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5"

    The full article can be found at http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

    The reminder I wanted to point to is this one:

    http://www.sqlservercentral.com/Forums/FindPost450363.aspx

    The caveot is partitioning ! (even the maxdop 1 will not save you from that).

    But except for that, ..... pure lightning :w00t:

    Thanks for that, Johan... thank goodness for temp tables, huh? 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jcrawf02 (5/23/2008)


    David Jackson (5/22/2008)


    Using your test data, what would you expect this to return?

    DECLARE @rebuildMe varchar(2000), @iteration int

    SET @rebuildMe = ''

    SELECT @rebuildMe = @rebuildMe + cast(n as varchar(10)) + ' '

    FROM #tally

    SELECT @rebuildMe

    Dave J

    yesterday, I would have expected that to return 11000 result sets, each with n+' ' tacked onto the end of the last. Today, I would assume it would return n+' '+(n+1)+' ' . . . etc up to 2000 characters.

    Cock on. When I try to explain how this works to developers, we always get into an argument about whether SQL is looping or not. Others have (hopefully) answered this for you now, better than I could! FWIW, this is the main difference between set thinking and procedural thinking. 😉

    You don't tell SQL how to do something, you ask it for what you want. Part of SQL's job is to work out for you how best to retrieve your data that matches your criteria. Your job is to work out how to ask efficiently. 😀

    Dave J


    http://glossopian.co.uk/
    "I don't know what I don't know."

  • Cock on.

    Okay, after I cleaned the soda out of my nose, I had to ask - was this intentional (the only reason I think it was is that you have a rooster on your avatar)?

    -- Cory

Viewing 15 posts - 151 through 165 (of 511 total)

You must be logged in to reply to this topic. Login to reply