The "Numbers" or "Tally" Table: What it is and how it replaces a loop.

  • Ed Wagner

    SSC Guru

    Points: 286960

    Eric M Russell - Wednesday, April 12, 2017 11:30 AM

    Microsoft should simply add a system tally table to SQL Server.

    Agreed.  The MS Connect item requesting built-in table of numbers has celebrated its 10th anniversary - and is still active!!!  Erland Sommarskog submitted it back on 2/18/2007.

    https://connect.microsoft.com/SQLServer/feedback/details/258733/add-a-built-in-table-of-numbers

  • Eric M Russell

    SSC Guru

    Points: 125032

    Oracle has this special built-in table called DUAL containing (1) row and a single column called DUMMY with a value of 'X'. Because Oracle doesn't support using a SELECT statement without a table name, DUAL is typically used for doing something like "SELECT SYSDATE FROM DUAL". The lack of imagination behind this "DUAL" table suggests it was added as an afterthought at some point in the distant past.

    I hope that SQL Server can on-up them with a proper TALLY table.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Jeff Moden

    SSC Guru

    Points: 995171

    Eirikur Eiriksson - Wednesday, April 12, 2017 11:36 AM

    Eric M Russell - Wednesday, April 12, 2017 11:30 AM

    Microsoft should simply add a system tally table to SQL Server.

    And of course with data compression (page) as a default
    😎

    Heh... no.... it shouldn't be a table.  It should be a nasty fast, machine language level function that would blow table access away.  😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Jeff Moden

    SSC Guru

    Points: 995171

    Ed Wagner - Wednesday, April 12, 2017 1:30 PM

    Eric M Russell - Wednesday, April 12, 2017 11:30 AM

    Microsoft should simply add a system tally table to SQL Server.

    Agreed.  The MS Connect item requesting built-in table of numbers has celebrated its 10th anniversary - and is still active!!!  Erland Sommarskog submitted it back on 2/18/2007.

    https://connect.microsoft.com/SQLServer/feedback/details/258733/add-a-built-in-table-of-numbers

    I should patent my water powder... just add water. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Thomas Rushton

    SSC-Insane

    Points: 22624

    Jeff Moden - Wednesday, April 12, 2017 3:44 PM

    I should patent my water powder... just add water. 😉

    I think you might be a bit late to the party on that one - I remember selling boxes of "Instant Water - Just add water" at a school fete back in 1980/1981...

    Thomas Rushton
    blog: https://thelonedba.wordpress.com

  • eliassal

    Ten Centuries

    Points: 1294

    Jeff, can you please explain why you use Cross join as follows


    SELECT TOP 11000 --equates to more than 30 years of dates
            IDENTITY(INT,1,1) AS N
       INTO dbo.Tally
       FROM Master.dbo.SysColumns sc1
       ,Master.dbo.SysColumns sc2

     I ran this T-sql without the
     Master.dbo.SysColumns sc2

    I get the exact same results, so what is the idea behind this join "
    Master.dbo.SysColumns sc2", I get the exact same results, so what is the idea behind this join?

  • Jonathan AC Roberts

    SSCoach

    Points: 17010

    eliassal - Thursday, March 7, 2019 3:19 AM

    Jeff, can you please explain why you use Cross join as follows


    SELECT TOP 11000 --equates to more than 30 years of dates
            IDENTITY(INT,1,1) AS N
       INTO dbo.Tally
       FROM Master.dbo.SysColumns sc1
       ,Master.dbo.SysColumns sc2

     I ran this T-sql without the
     Master.dbo.SysColumns sc2

    I get the exact same results, so what is the idea behind this join "
    Master.dbo.SysColumns sc2", I get the exact same results, so what is the idea behind this join?

    If you wanted to increase the number of rows from TOP 11000 to say 1000000 you would see that the join is needed. It's just a cartesian join to make sure enough rows are returned by the query.

  • Jeff Moden

    SSC Guru

    Points: 995171

    eliassal - Thursday, March 7, 2019 3:19 AM

    Jeff, can you please explain why you use Cross join as follows


    SELECT TOP 11000 --equates to more than 30 years of dates
            IDENTITY(INT,1,1) AS N
       INTO dbo.Tally
       FROM Master.dbo.SysColumns sc1
       ,Master.dbo.SysColumns sc2

     I ran this T-sql without the
     Master.dbo.SysColumns sc2

    I get the exact same results, so what is the idea behind this join "
    Master.dbo.SysColumns sc2", I get the exact same results, so what is the idea behind this join?

    It used to be (way back in SQL Server 2000) that Master.dbo.SysColumns only had about 4000 rows in it and the CROSS JOIN to produce a large enough "Pseudo Cursor" was essential.

    I also wrote the article before I had access to 2005 and before 2008.  In SQL Server 2016, I'm seeing about 15K rows and while that's certainly enough for an 11K row "Psuedo Cursor", it's not enough for a million row "Pseudo Cursor", which I frequently do tests with.  As Jonathan states, it's just in case (and I frequently do) I need something larger.  I always include the CROSS JOIN even if I need only 1K rows for two reasons... it makes things a bit more bullet proof in the face of scalability (and building a Tally table is just the beginning of what I use Pseudo-Cursors for) and it makes things consistent without extra overhead.  It's all a part of pattern recognition in code. 

    Also, if someone else sees the code and tries to use it for something larger than 11K rows, they stand a good chance of it working even if they don't know what the heck it does. 😀

    Also, the Master.dbo.syscolumns table has been converted to a "compatibility view" as of 2005 and has been deprecated.  I'm actually surprised that they haven't actually killed it yet (that usually means that MS still has some code somewhere that still has a dependency on it).  You should stop using it ASAP and start using sys.all_columns, which ships with more than 4000 rows as of 2005, has about 9K rows in 2016, doesn't require 3 part naming, and isn't going to be deprecated anytime in the near future (well, can't promise that... we are talking about MS here).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • drew.allen

    SSC Guru

    Points: 76662

    Jeff Moden - Thursday, March 7, 2019 6:39 AM

    eliassal - Thursday, March 7, 2019 3:19 AM

    Jeff, can you please explain why you use Cross join as follows


    SELECT TOP 11000 --equates to more than 30 years of dates
            IDENTITY(INT,1,1) AS N
       INTO dbo.Tally
       FROM Master.dbo.SysColumns sc1
       ,Master.dbo.SysColumns sc2

     I ran this T-sql without the
     Master.dbo.SysColumns sc2

    I get the exact same results, so what is the idea behind this join "
    Master.dbo.SysColumns sc2", I get the exact same results, so what is the idea behind this join?

    It used to be (way back in SQL Server 2000) that Master.dbo.SysColumns only had about 4000 rows in it and the CROSS JOIN to produce a large enough "Pseudo Cursor" was essential.

    I also wrote the article before I had access to 2005 and before 2008.  In SQL Server 2016, I'm seeing about 15K rows and while that's certainly enough for an 11K row "Psuedo Cursor", it's not enough for a million row "Pseudo Cursor", which I frequently do tests with.  As Jonathan states, it's just in case (and I frequently do) I need something larger.  I always include the CROSS JOIN even if I need only 1K rows for two reasons... it makes things a bit more bullet proof in the face of scalability (and building a Tally table is just the beginning of what I use Pseudo-Cursors for) and it makes things consistent without extra overhead.  It's all a part of pattern recognition in code. 

    Also, if someone else sees the code and tries to use it for something larger than 11K rows, they stand a good chance of it working even if they don't know what the heck it does. 😀

    Also, the Master.dbo.syscolumns table has been converted to a "compatibility view" as of 2005 and has been deprecated.  I'm actually surprised that they haven't actually killed it yet (that usually means that MS still has some code somewhere that still has a dependency on it).  You should stop using it ASAP and start using sys.all_columns, which ships with more than 4000 rows as of 2005, has about 9K rows in 2016, doesn't require 3 part naming, and isn't going to be deprecated anytime in the near future (well, can't promise that... we are talking about MS here).

    Since the introduction of table value constructors, you shouldn't be reading physical tables to create a tally table at all.  Itzik Ben-Gan came up with a blazing fast version that uses only table value constructors and CTEs.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • eliassal

    Ten Centuries

    Points: 1294

    drew, you indicate
    "Itzik Ben-Gan came up with a blazing fast version that uses only table value constructors and CTEs"
    Can you please share the url for the article?
    hanksT

  • Jonathan AC Roberts

    SSCoach

    Points: 17010

    eliassal - Thursday, March 7, 2019 9:41 AM

    drew, you indicate
    "Itzik Ben-Gan came up with a blazing fast version that uses only table value constructors and CTEs"
    Can you please share the url for the article?
    hanksT

    I think this is it: http://tsql.solidq.com/SourceCodes/GetNums.txt It's a bit long-winded if you have a later version of SQL Server that can uses VALUES, it could be shortened to something like this:
    DECLARE @Count bigint=1000000;
    WITH A(A) AS (SELECT '' FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) T(C))
    SELECT TOP(@Count) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) N FROM A A,A B,A C,A D,A E,A F,A G,A H -- 16^8

    But don't forget, the script in the article is for a one-off run to populate a permanent tally table. So slight performance gains using different code are not a critical issue for the population code in the article. I'm not actually sure if it would be any faster than the code in the article.

  • drew.allen

    SSC Guru

    Points: 76662

    There is a write-up on Dwain Camps' site.  Tally Tables.  If can can, you want to use that to create an inline table-valued function.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Alan Burstein

    SSC Guru

    Points: 61067

    drew.allen - Thursday, March 7, 2019 8:49 AM

    Jeff Moden - Thursday, March 7, 2019 6:39 AM

    eliassal - Thursday, March 7, 2019 3:19 AM

    Jeff, can you please explain why you use Cross join as follows


    SELECT TOP 11000 --equates to more than 30 years of dates
            IDENTITY(INT,1,1) AS N
       INTO dbo.Tally
       FROM Master.dbo.SysColumns sc1
       ,Master.dbo.SysColumns sc2

     I ran this T-sql without the
     Master.dbo.SysColumns sc2

    I get the exact same results, so what is the idea behind this join "
    Master.dbo.SysColumns sc2", I get the exact same results, so what is the idea behind this join?

    It used to be (way back in SQL Server 2000) that Master.dbo.SysColumns only had about 4000 rows in it and the CROSS JOIN to produce a large enough "Pseudo Cursor" was essential.

    I also wrote the article before I had access to 2005 and before 2008.  In SQL Server 2016, I'm seeing about 15K rows and while that's certainly enough for an 11K row "Psuedo Cursor", it's not enough for a million row "Pseudo Cursor", which I frequently do tests with.  As Jonathan states, it's just in case (and I frequently do) I need something larger.  I always include the CROSS JOIN even if I need only 1K rows for two reasons... it makes things a bit more bullet proof in the face of scalability (and building a Tally table is just the beginning of what I use Pseudo-Cursors for) and it makes things consistent without extra overhead.  It's all a part of pattern recognition in code. 

    Also, if someone else sees the code and tries to use it for something larger than 11K rows, they stand a good chance of it working even if they don't know what the heck it does. 😀

    Also, the Master.dbo.syscolumns table has been converted to a "compatibility view" as of 2005 and has been deprecated.  I'm actually surprised that they haven't actually killed it yet (that usually means that MS still has some code somewhere that still has a dependency on it).  You should stop using it ASAP and start using sys.all_columns, which ships with more than 4000 rows as of 2005, has about 9K rows in 2016, doesn't require 3 part naming, and isn't going to be deprecated anytime in the near future (well, can't promise that... we are talking about MS here).

    Since the introduction of table value constructors, you shouldn't be reading physical tables to create a tally table at all.  Itzik Ben-Gan came up with a blazing fast version that uses only table value constructors and CTEs.

    Drew

     Itzik Ben-Gan came up with a blazing fast version that uses only table value constructors and CTEs.

    So did Jeff. His is called fnTally, it's also super fast and has some advantages over Itzik's version. I, too, have one which borrows ideas Itzik's and Jeff's along along with my own special features (e.g. the concept of "gaps" which allows you to return, say, all even numbers between 1 and 100 by generating 50 rows vs 100 rows+filtering AND the ability to return those numbers in descending order without a sort.) I'm not recommending one vs the another but I do believe people should start using a function for their tally table needs instead of constantly copy/pasting CTE Tally table code or using system tables (which are fine but their usage can confuse folks.)

    I always have persisted tally table around too. There are still cases where there is not substitute for a correctly indexed dbo.tally. For example, you can't use TVFs in indexed views. 


    CREATE FUNCTION dbo.rangeAB
    (
     @low BIGINT,
     @high BIGINT,
     @gap BIGINT,
     @row1 BIT
    )
    /****************************************************************************************
    [Purpose]:
    Creates a lazy, memory optimized, ORDERED sequence of up to 531,441,000,000 integers
    between @low and @high (inclusive). RangeAB is a pure, 100% set-based alternative to
    solving T-SQL problems using iterative methods such as loops, cursors and recursive CTEs.
    RangeAB is based on Itzik Ben-Gan's getnums function for producing a range of numbers and
    borrows logic from Jeff Moden's fnTally for creating an optional 0-based Row Number (RN.)

    RangeAB adds more functionality to mimic Clojure and Python's Range functions. The main
    difference is that rangeAB is inclusive. "Range" is a reserved SQL keyword which is why I
    chose the name, "RangeAB".

    [Author]: Alan Burstein

    [Compatibility]:
    SQL Server 2008+ and Azure SQL Database

    [Syntax]:
    SELECT r.RN, r.Op, r.N1, r.N2
    FROM dbo.rangeAB(@low,@high,@gap,@row1) AS r;

    [Parameters]:
    @low = BIGINT; represents the lowest value for N1.
    @high = BIGINT; represents the highest value for N1.
    @gap = BIGINT; represents how much N1 and N2 will increase each row. @gap also
       represents the difference between N1 and N2.
    @row1 = BIT; represents the first value of RN. When @row = 0 then RN starts at 0,
       when @row = 1 then RN starts at 1.

    [Returns]:
    Inline Table Valued Function returns:
    RN = BIGINT; a row number that works just like T-SQL ROW_NUMBER() except that it can
      start at 0 or 1 which is dictated by @row1.
    OP = BIGINT; returns the "opposite number that relates to rn. When rn begins with 0 and
      ends with 10 then 10 is the opposite of 0, 9 the opposite of 1, etc. When rn begins
      with 1 and ends with 5 then 1 is the opposite of 5, 2 the opposite of 4, etc...
    N1 = BIGINT; a *Lazy* sequence of numbers starting at @low and incrimenting by @gap until
      the next number in the sequence is greater than @high.
    N2 = BIGINT; a lazy sequence of numbers starting @low+@gap and incrimenting by @gap.

    [Dependencies]:
    N/A

    [Developer Notes]:
    1. The lowest and highest possible numbers returned are whatever is allowable by a
      bigint. The function, however, returns no more than 531,441,000,000 rows (8100^3).
    2. @gap does not affect rn, rn will begin at @row1 and increase by 1 until the last row
      unless its used in a query where a filter is applied to rn.
    3. @gap must be greater than 0 or the function will not return any rows.
    4. Keep in mind that when @row1 is 0 then the highest row-number will be the number of
      rows returned minus 1
    5. If you only need is a sequential set beginning at 0 or 1 then, for best performance
      use the RN column. Use N1 and/or N2 when you need to begin your sequence at any
      number other than 0 or 1 or if you need a gap between your sequence of numbers.
    6. Although @gap is a bigint it must be a positive integer or the function will
      not return any rows.
    7. The function will not return any rows when one of the following conditions are true:
      * any of the input parameters are NULL
      * @high is less than @low
      * @gap is not greater than 0
      To force the function to return all NULLs instead of not returning anything you can
      add the following code to the end of the query:

      UNION ALL
      SELECT NULL, NULL, NULL, NULL
      WHERE NOT (@high&@low&@gap&@row1 IS NOT NULL AND @high >= @low AND @gap > 0)

      This code was excluded as it adds a ~5% performance penalty.
    8. There is no performance penalty for sorting by rn ASC; there is a large performance
      penalty for sorting in descending order WHEN @row1 = 1; WHEN @row1 = 0
      If you need a descending sort the use OP in place of RN then sort by rn ASC.
    9. For 2012+ systems, The TOP logic can be replaced with:
     OFFSET 0 ROWS FETCH NEXT
      ABS((ISNULL(@high,0)-ISNULL(@low,0))/ISNULL(@gap,0)+ISNULL(@row1,1)) ROWS ONLY

    Best Practices:
    --===== 1. Using RN (rownumber)
    -- (1.1) The best way to get the numbers 1,2,3...@high (e.g. 1 to 5):
    SELECT RN FROM dbo.rangeAB(1,5,1,1);
    -- (1.2) The best way to get the numbers 0,1,2...@high-1 (e.g. 0 to 5):
    SELECT RN FROM dbo.rangeAB(0,5,1,0);

    --===== 2. Using OP for descending sorts without a performance penalty
    -- (2.1) The best way to get the numbers 5,4,3...@high (e.g. 5 to 1):
    SELECT op FROM dbo.rangeAB(1,5,1,1) ORDER BY rn ASC;
    -- (2.2) The best way to get the numbers 0,1,2...@high-1 (e.g. 5 to 0):
    SELECT op FROM dbo.rangeAB(1,6,1,0) ORDER BY rn ASC;

    --===== 3. Using N1
    -- (3.1) To begin with numbers other than 0 or 1 use N1 (e.g. -3 to 3):
    SELECT N1 FROM dbo.rangeAB(-3,3,1,1);
    -- (3.2) ROW_NUMBER() is built in. If you want a ROW_NUMBER() include RN:
    SELECT RN, N1 FROM dbo.rangeAB(-3,3,1,1);
    -- (3.3) If you wanted a ROW_NUMBER() that started at 0 you would do this:
    SELECT RN, N1 FROM dbo.rangeAB(-3,3,1,0);

    --===== 4. Using N2 and @gap
    -- (4.1) To get 0,10,20,30...100, set @low to 0, @high to 100 and @gap to 10:
    SELECT N1 FROM dbo.rangeAB(0,100,10,1);
    -- (4.2) Note that N2=N1+@gap; this allows you to create a sequence of ranges.
    --   For example, to get (0,10),(10,20),(20,30).... (90,100):
    SELECT N1, N2 FROM dbo.rangeAB(0,90,10,1);
    -- (4.3) Remember that a rownumber is included and it can begin at 0 or 1:
    SELECT RN, N1, N2 FROM dbo.rangeAB(0,90,10,1);

    [Examples]:
    --===== 1. Generating Sample data (using rangeAB to create "dummy rows")
    -- The query below will generate 10,000 ids and random numbers between 50,000 and 500,000
    SELECT
     someId  = r.RN,
     someNumer = ABS(CHECKSUM(NEWID())%450000)+50001
    FROM rangeAB(1,10000,1,1) r;

    --===== 2. Create a series of dates; rn is 0 to include the first date in the series
    DECLARE @startdate DATE = '20180101', @enddate DATE = '20180131';

    SELECT r.RN, calDate = DATEADD(dd, r.RN, @startdate)
    FROM dbo.rangeAB(1, DATEDIFF(dd,@startdate,@enddate),1,0) r;
    GO

    --===== 3. Splitting (tokenizing) a string with fixed sized items
    -- given a delimited string of identifiers that are always 7 characters long
    DECLARE @string VARCHAR(1000) = 'A601225,B435223,G008081,R678567';

    SELECT
     itemNumber = r.RN, -- item's ordinal position
     itemIndex = r.n1, -- item's position in the string (it's CHARINDEX value)
     item   = SUBSTRING(@string, r.n1, 7) -- item (token)
    FROM dbo.rangeAB(1, LEN(@string), 8,1) r;
    GO

    --===== 4. Splitting (tokenizing) a string with random delimiters
    DECLARE @string VARCHAR(1000) = 'ABC123,999F,XX,9994443335';

    SELECT
     itemNumber = ROW_NUMBER() OVER (ORDER BY r.RN), -- item's ordinal position
     itemIndex = r.n1+1, -- item's position in the string (it's CHARINDEX value)
     item   = SUBSTRING
         (
          @string,
          r.n1+1,
          ISNULL(NULLIF(CHARINDEX(',',@string,r.n1+1),0)-r.n1-1, 8000)
         ) -- item (token)
    FROM dbo.rangeAB(0,DATALENGTH(@string),1,1) r
    WHERE SUBSTRING(@string,r.n1,1) = ',' OR r.n1 = 0;
    -- logic borrowed from: http://www.sqlservercentral.com/articles/Tally+Table/72993/

    --===== 5. Grouping by a weekly intervals
    -- 5.1. how to create a series of start/end dates between @startDate & @endDate
    DECLARE @startDate DATE = '1/1/2015', @endDate DATE = '2/1/2015';
    SELECT
     WeekNbr = r.RN,
     WeekStart = DATEADD(DAY,r.N1,@StartDate),
     WeekEnd = DATEADD(DAY,r.N2-1,@StartDate)
    FROM dbo.rangeAB(0,datediff(DAY,@StartDate,@EndDate),7,1) r;
    GO

    -- 5.2. LEFT JOIN to the weekly interval table
    BEGIN
    DECLARE @startDate datetime = '1/1/2015', @endDate datetime = '2/1/2015';
    -- sample data
    DECLARE @loans TABLE (loID INT, lockDate DATE);
    INSERT @loans SELECT r.RN, DATEADD(dd, ABS(CHECKSUM(NEWID())%32), @startDate)
    FROM dbo.rangeAB(1,50,1,1) r;

    -- solution
    SELECT
      WeekNbr = r.RN,
      WeekStart = dt.WeekStart,
      WeekEnd = dt.WeekEnd,
      total  = COUNT(l.lockDate)
    FROM dbo.rangeAB(0,datediff(DAY,@StartDate,@EndDate),7,1) r
    CROSS APPLY (VALUES (
      CAST(DATEADD(DAY,r.N1,@StartDate) AS DATE),
      CAST(DATEADD(DAY,r.N2-1,@StartDate) AS DATE))) dt(WeekStart,WeekEnd)
    LEFT JOIN @loans l ON l.lockDate BETWEEN dt.WeekStart AND dt.WeekEnd
    GROUP BY r.RN, dt.WeekStart, dt.WeekEnd ;
    END;

    --===== 6. Identify the first vowel and last vowel in a along with their positions
    DECLARE @string VARCHAR(200) = 'This string has vowels';

    SELECT TOP(1) position = r.RN, letter = SUBSTRING(@string,r.RN,1)
    FROM dbo.rangeAB(1,LEN(@string),1,1) r
    WHERE SUBSTRING(@string,r.RN,1) LIKE '%[aeiou]%'
    ORDER BY r.RN;

    -- To avoid a sort in the execution plan we'll use OP instead of RN
    SELECT TOP(1) position = r.Op, letter = SUBSTRING(@string,r.Op,1)
    FROM dbo.rangeAB(1,LEN(@string),1,1) r
    WHERE SUBSTRING(@string,r.RN,1) LIKE '%[aeiou]%'
    ORDER BY r.RN;

    -----------------------------------------------------------------------------------------
    [Revision History]:
    Rev 00 - 20140518 - Initial Development - AJB
    Rev 01 - 20151029 - Added 65 rows. Now L1=465; 465^3=100.5M. Updated comments - AJB
    Rev 02 - 20180613 - Complete re-design including opposite number column (op)
    Rev 03 - 20180920 - Added additional CROSS JOIN to L2 for 530B rows max - AJB
    Rev 04 - 20190306 - Added inline aliasing function(f):
           f.R=(@high-@low)/@gap, f.N=@gap+@low - AJB
    *****************************************************************************************/
    RETURNS TABLE WITH SCHEMABINDING AS RETURN
    WITH
    f(R,N) AS (SELECT (@high-@low)/@gap, @gap+@low),
    L1(N) AS
    (
    SELECT 1
    FROM (VALUES
     (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),
     (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),
     (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),
     (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),
     (0),(0)) T(N) -- 90 values
    ),
    L2(N) AS (SELECT 1 FROM L1 a CROSS JOIN L1 b CROSS JOIN L1 c),
    iTally AS (SELECT RN = ROW_NUMBER() OVER (ORDER BY (SELECT 1)) FROM L2 a CROSS JOIN L2 b)
    SELECT r.RN, r.Op, r.N1, r.N2
    FROM
    (
    SELECT
      RN = 0,  -- ROW_NUMBER()
      OP = f.R, -- O(RN)
      N1 = @low, -- N
      N2 = f.N -- LEAD(N,1) OVER (ORDER BY RN)
    FROM f
    WHERE @row1 = 0
    UNION ALL
    SELECT TOP (ABS((ISNULL(@high,0)-ISNULL(@low,0))/ISNULL(@gap,0)+ISNULL(@row1,1)))
      RN = i.RN,
      OP = f.R+(2*@row1)-i.RN,
      N1 = f.N*(i.RN-@row1),
      N2 = f.N*(i.RN-(@row1-1))
    FROM   f
    CROSS JOIN iTally AS i
    ORDER BY i.RN
    ) AS r
    WHERE @high&@low&@gap&@row1 IS NOT NULL AND @high >= @low
    AND @gap > 0;

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Jeff Moden

    SSC Guru

    Points: 995171

    drew.allen - Thursday, March 7, 2019 8:49 AM

    Jeff Moden - Thursday, March 7, 2019 6:39 AM

    eliassal - Thursday, March 7, 2019 3:19 AM

    Jeff, can you please explain why you use Cross join as follows


    SELECT TOP 11000 --equates to more than 30 years of dates
            IDENTITY(INT,1,1) AS N
       INTO dbo.Tally
       FROM Master.dbo.SysColumns sc1
       ,Master.dbo.SysColumns sc2

     I ran this T-sql without the
     Master.dbo.SysColumns sc2

    I get the exact same results, so what is the idea behind this join "
    Master.dbo.SysColumns sc2", I get the exact same results, so what is the idea behind this join?

    It used to be (way back in SQL Server 2000) that Master.dbo.SysColumns only had about 4000 rows in it and the CROSS JOIN to produce a large enough "Pseudo Cursor" was essential.

    I also wrote the article before I had access to 2005 and before 2008.  In SQL Server 2016, I'm seeing about 15K rows and while that's certainly enough for an 11K row "Psuedo Cursor", it's not enough for a million row "Pseudo Cursor", which I frequently do tests with.  As Jonathan states, it's just in case (and I frequently do) I need something larger.  I always include the CROSS JOIN even if I need only 1K rows for two reasons... it makes things a bit more bullet proof in the face of scalability (and building a Tally table is just the beginning of what I use Pseudo-Cursors for) and it makes things consistent without extra overhead.  It's all a part of pattern recognition in code. 

    Also, if someone else sees the code and tries to use it for something larger than 11K rows, they stand a good chance of it working even if they don't know what the heck it does. 😀

    Also, the Master.dbo.syscolumns table has been converted to a "compatibility view" as of 2005 and has been deprecated.  I'm actually surprised that they haven't actually killed it yet (that usually means that MS still has some code somewhere that still has a dependency on it).  You should stop using it ASAP and start using sys.all_columns, which ships with more than 4000 rows as of 2005, has about 9K rows in 2016, doesn't require 3 part naming, and isn't going to be deprecated anytime in the near future (well, can't promise that... we are talking about MS here).

    Since the introduction of table value constructors, you shouldn't be reading physical tables to create a tally table at all.  Itzik Ben-Gan came up with a blazing fast version that uses only table value constructors and CTEs.

    Drew

    Heh... not quite right.  The Tally Table is still faster than the table constructors.  It's not by much and it's less noticeable for most folks due to the speed of today's machines but, when you're willing to trade reads for speed, a physical Tally Table is difficult to beat.

    And, no... I have no bad words for Itzik's method (which I refer to as cCTEs or Cascading CTEs).  Done correctly, it's blazing fast and produces no reads.

    As with all else in SQL Server, "It Depends". 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Luis Cazares

    SSC Guru

    Points: 183583

    Jeff Moden - Thursday, March 7, 2019 4:12 PM

    Heh... not quite right.  The Tally Table is still faster than the table constructors.  It's not by much and it's less noticeable for most folks due to the speed of today's machines but, when you're willing to trade reads for speed, a physical Tally Table is difficult to beat.

    And, no... I have no bad words for Itzik's method (which I refer to as cCTEs or Cascading CTEs).  Done correctly, it's blazing fast and produces no reads.

    As with all else in SQL Server, "It Depends". 😀

    I believe that Drew meant that you shouldn't read physical tables to create a tally table, not that you shouldn't use a physical tally table. I'm not sure if someone has actually tested the performance of an In-Memory table. I haven't had the opportunity to work with those on real systems.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2

Viewing 15 posts - 451 through 465 (of 498 total)

You must be logged in to reply to this topic. Login to reply