The Numbers Table

  • Comments posted to this topic are about the item The Numbers Table

  • Here is another solution without using loops....

    INSERT INTO #util_nums

    SELECT TOP 1000000 ROW_NUMBER() OVER (ORDER BY a.[object_id])

    FROM sys.columns a

    CROSS JOIN sys.columns b

    GO

  • Just an FYI... the source of the CTE in your article is by a fellow called Itzik Ben-Gan and can be found at the bottom of page 255 in a book titled "Inside Microsoft SQL Server 2005 - T-SQL Querying".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This topic was cover complitly by Itzik Ben-Gan years ago

  • Earlier coverage was more than this. But this one also is good. 🙂

  • I'd consider using a table valued function instead that only returns the rows I'm interested in:

    e.g.:

    CREATE FUNCTION [dbo].[Nums]

    (

    @StartNum int,

    @EndNum int

    )

    RETURNS @table TABLE (Num int)

    AS BEGIN

    DECLARE @tmpNum int

    SET @tmpNum = @StartNum

    WHILE @tmpNum < @EndNum

    BEGIN

    INSERT INTO @table VALUES (@tmpNum)

    SET @tmpNum = @tmpNum + 1

    END

    RETURN

    END

    GO

    Example Usage:

    SELECT * FROM Nums(1000, 2000)

  • jacroberts (11/24/2008)


    I'd consider using a table valued function instead that only returns the rows I'm interested in:

    e.g.:

    CREATE FUNCTION [dbo].[Nums]

    (

    @StartNum int,

    @EndNum int

    )

    RETURNS @table TABLE (Num int)

    AS BEGIN

    DECLARE @tmpNum int

    SET @tmpNum = @StartNum

    WHILE @tmpNum < @EndNum

    BEGIN

    INSERT INTO @table VALUES (@tmpNum)

    SET @tmpNum = @tmpNum + 1

    END

    RETURN

    END

    GO

    Example Usage:

    SELECT * FROM Nums(1000, 2000)

    Ummm... have you actually tried using one with a WHILE loop like that? A cross-joined CTE will be much quicker. The whole purpose of a Numbers or Tally table is to get rid of loops.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • yuvalpe (11/24/2008)


    This topic was cover complitly by Itzik Ben-Gan years ago

    So? There's lot's of folks that don't even know who he is, never mind what a Numbers or Tally table is.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jacroberts (11/24/2008)


    I'd consider using a table valued function instead that only returns the rows I'm interested in:

    e.g.:

    CREATE FUNCTION [dbo].[Nums]

    (

    @StartNum int,

    @EndNum int

    )

    RETURNS @table TABLE (Num int)

    AS BEGIN

    DECLARE @tmpNum int

    SET @tmpNum = @StartNum

    WHILE @tmpNum < @EndNum

    BEGIN

    INSERT INTO @table VALUES (@tmpNum)

    SET @tmpNum = @tmpNum + 1

    END

    RETURN

    END

    GO

    Example Usage:

    SELECT * FROM Nums(1000, 2000)

    Unfortunatly non-inline functions, including table valued ones, run miserably slow in SQL Server. I hope they change it in the future, but for now solutions like yours (and I used similar in the past) just won't deliver performance anywhere near that of a numbers table.

    The main issue in your solution (and solutions like it) is that multiple statements are exectuted. Each insert performed within the loop carries so much overhead that just reading a table from a cached page performs tons faster. The set based aproach only has to deal one time with the overhead compared to one time per iteration for the procedural one. On top of it...the return table (called @table in your case) is stored in the temp DB, so data is written to disk (which is slow), compared to a number table, which is just reading data, likely to already be in the cache.

    So as things stand:

    * When using a function, always try to use a inline table valued function as this is integrated in your query.

    * Do not fill up a table one row at a time (Jeff Moden calls it RBAR for good reason)

    That said, there are occasions where a non-inline table valued function does better then a inlined one (I recently has such a case). In certain complex queries where the optimiser makes wrong choices this can happen. But you should still fill up the result table in as few statements as possible, which is where a good number table or non-procedural number function comes in.

    There are good examples of number/tally functions to be found of this forum. Currently I am still in the process of optimizing one I found and modified. Testing unfotunatly takes time and I want the solution to be scalable across at least several threads, once its finished and if I do succeed, it will end up on this site as well for everyone to use.

  • It depends how much performance is an issue, for a lot of queries it is not. For example, if there is a daily report that runs in batch it would not matter. I've tested the query on my development environment it generated and queried the table with 4000 rows in in less than 1 second.

  • jacroberts (11/24/2008)


    It depends how much performance is an issue, for a lot of queries it is not. For example, if there is a daily report that runs in batch it would not matter. I've tested the query on my development environment it generated and queried the table with 4000 rows in in less than 1 second.

    That's pretty slow and performance should be always an issue...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Performance isn't always an issue as in my previous example.

  • Performance might not always be critical, but it is and should always be an issue for anything but a known to be 'short' ad-doc operation. It is good practice to have several well optimized tools in your toolbox and since a number table or number function is so generic, I consider them one of my tools. There is no reason not to use an optimised generic tool over a non-optimized one, ever.

  • Thanks for all the responses and thanks yuvalpe for identifying the original source of that query. Like I said, I never claimed to be the author. And I looked for the original source.

    MrAkki: The only problem I see with that is that if you are building this in the early stages of building a database you could end up with a smaller numbers table than you need. That can ALWAYS be solved by adding more cross joins, but I liked this simple approach and it performs quite well.

    I wasn't attempting to list every possibly method for creating a numbers table, just a few different ways with some pros and cons of each.

  • The first place I came across using a numbers table was from a Joe Celko book. I take a slightly different approach, typically using a view (with the following CTE) to generate the numbers.

    WITH

    I AS (SELECT iNbr = 0 UNION SELECT iNbr = 1 UNION

    SELECT iNbr = 2 UNION SELECT iNbr = 3 UNION

    SELECT iNbr = 4 UNION SELECT iNbr = 5 UNION

    SELECT iNbr = 6 UNION SELECT iNbr = 7 UNION

    SELECT iNbr = 8 UNION SELECT iNbr = 9 ),

    X AS (SELECT iNbr = I.iNbr * 10 FROM I ),

    C AS (SELECT iNbr = I.iNbr * 100 FROM I ),

    M AS (SELECT iNbr = I.iNbr * 1000 FROM I ),

    N AS (

    SELECT iNbr = I.iNbr + X.iNbr + C.iNbr + M.iNbr

    FROMI

    CROSS JOIN X

    CROSS JOIN C

    CROSS JOIN M

    )

    SELECTiNbr

    FROMN

    ORDER BY iNbr

    I started running into places where I needed the numbers on different servers, with different ranges, etc. I ended up using something like this (above) instead of a table. I do not have any reference as to performance or speed, but it does work quickly in my environment.

    Beer's Law: Absolutum obsoletum
    "if it works it's out-of-date"

Viewing 15 posts - 1 through 15 (of 106 total)

You must be logged in to reply to this topic. Login to reply