Using a CTE as a Tally Table

  • Comments posted to this topic are about the item Using a CTE as a Tally Table

  • This part of your code causes a full table scan regardless of the TOP 52 because of the ORDER BY in the ROW_NUMBER().  Since you're doing the ORDER BY on the StockCode table, it also does a shedload of unnecessary reads.  Also, there is no guarantee that you'll return 52 rows because there's no guarantee that the Data.Stock table will start with or be maintained in such a fashion as to always have at least 52 rows in it because it's a user table rather than a utility or fix system table.

    SELECT TOP 52 ROW_NUMBER() OVER (ORDER BY StockCode) AS Num
    FROM Data.Stock

    This part of your code causes a full table scan because the WHERE clause is non-SARGable as well as there being no indexes on the table.

    SELECT SUM(TotalSalePrice) AS SalesForTheWeek
                 
    ,DatePart(wk, SaleDate) AS WeekNo
    FROM Data.Sales
    WHERE YEAR(SaleDate) = 2016
    GROUP BY DatePart(wk, SaleDate)

    Also, you should always use the 2 part naming convention both in the FROM clause (which you've dome) and the SELECT clause of a joined select (which you've not done).

    Also, what is a "week"?  DATEPART(wk) will only return 2 days for the first week of Jan 2016.  The only reason why it returns 7 days for the last week of 2016 is because you get lucky and average of once every 7 years.

    Last but not least, your sample data model in the PrestigeCars database is devoid of any and all Primary Keys and indexes.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Cannot see any reason for scanning the table to produce the week numbers when it can be done more efficiently without it.
    😎

    Here is a far better method with only two constant scans and no sort operator.


    USE TEEST;
    GO
    SET NOCOUNT ON;

    ;WITH T(N) AS (SELECT X.N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0)) X(N))
    , NUMS(N) AS (SELECT TOP (52) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N
         FROM T T1,T T2)
    SELECT
      NM.N AS WEEK_NUM
    FROM  NUMS  NM
    ORDER BY NM.N;

  • Rather use a recursive cte to create your numbers.

    WITH cte AS (
       SELECT 1 as Num
       UNION ALL
       SELECT Num + 1 FROM cte  WHERE Num < 52
        )
    SELECT * FROM cte

    This method works well for populating Calendar tables, too.

    But if you insist on using an existing table, sys.columns is just about guaranteed to always have enough entries.

  • BluePeter - Thursday, August 9, 2018 2:20 AM

    Rather use a recursive cte to create your numbers.

    WITH cte AS (
       SELECT 1 as Num
       UNION ALL
       SELECT Num + 1 FROM cte  WHERE Num < 52
        )
    SELECT * FROM cte

    This method works well for populating Calendar tables, too.

    But if you insist on using an existing table, sys.columns is just about guaranteed to always have enough entries.

    Be careful here, the recursive CTE is much slower than the inline tally table CTE method I posted, strongly advice against using a recursive CTE for this purpose.
    😎

  • A small point, referred to above, there are 53 weeks in a financial year, to catch the few days at each end. Thanks for the article though. Good topic and I'm sure the comments above are meant to help, not criticise.


    Tks,

    JK

  • Thank you for the article!
    In the discussion everything important is mentioned.But the most important: Without this article no discussion would be happen. Therefore Thank you once more!

    Gerhard P.Advanced BI DeveloperAustria

  • Eirikur Eiriksson - Thursday, August 9, 2018 2:47 AM

    Be careful here, the recursive CTE is much slower than the inline tally table CTE method I posted, strongly advice against using a recursive CTE for this purpose.
    😎

    Agreed, it doesn't scale upwards, 10,000 is slower with a recursive cte.
    But no difference for 52 (or 53)
    I'll keep yours in mind, never seen it before.

  • BluePeter - Thursday, August 9, 2018 5:07 AM

    Eirikur Eiriksson - Thursday, August 9, 2018 2:47 AM

    Be careful here, the recursive CTE is much slower than the inline tally table CTE method I posted, strongly advice against using a recursive CTE for this purpose.
    😎

    Agreed, it doesn't scale upwards, 10,000 is slower with a recursive cte.
    But no difference for 52 (or 53)
    I'll keep yours in mind, never seen it before.

    The reason you think there is no difference is because of what you're measuring and the way your measuring it.  There's another thing to consider, as well...

    If you practice the wrong way just because of low row counts, you'll never get good at the right way.  You also don't have to actually build the cCTE method (Cascading CTEs rather than Recursive CTEs) each and every time you want a sequence of numbers.  You can easily build an iTVF (Inline Table Valued Function) that will do that for you and, unlike scalar functions and mTVFs (Mult-statment Table Valued Functions), provides no additional hit if you refer to it within other iTVFs.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • 53 is probably sufficient, but if a leap year starts on Saturday then Sunday Dec 31 will be in week 54.  Hopefully they don't mind missing Sunday sales on New Year's Eve once every 28 years.  This happens in 1916, 1944, 1972, 2000, 2028.

    You could merge that day into week 53:

    SELECT   SUM(TotalSalePrice) AS SalesForTheWeek
            ,CASE w.WeekNo WHEN 54 THEN 53 ELSE w.WeekNo END AS WeekNo
    FROM    Data.Sales
    CROSS APPLY ( SELECT WeekNo = DatePart(wk, SaleDate) ) w
    WHERE   YEAR(SaleDate) = 2016
    GROUP BY  CASE w.WeekNo WHEN 54 THEN 53 ELSE w.WeekNo END

  • Scott Coleman - Thursday, August 9, 2018 8:30 AM

    53 is probably sufficient, but if a leap year starts on Saturday then Sunday Dec 31 will be in week 54.  Hopefully they don't mind missing Sunday sales on New Year's Eve once every 28 years.  This happens in 1916, 1944, 1972, 2000, 2028.

    You could merge that day into week 53:

    SELECT   SUM(TotalSalePrice) AS SalesForTheWeek
            ,CASE w.WeekNo WHEN 54 THEN 53 ELSE w.WeekNo END AS WeekNo
    FROM    Data.Sales
    CROSS APPLY ( SELECT WeekNo = DatePart(wk, SaleDate) ) w
    WHERE   YEAR(SaleDate) = 2016
    GROUP BY  CASE w.WeekNo WHEN 54 THEN 53 ELSE w.WeekNo END

    Wow! Did not know that. We put the last "5" weeks in period 12, so we will likely merge that "extra" day into week 53 also. Will put a slight skew on YoY figures, but so does Easter, so we will live with it :-/ Tks for that


    Tks,

    JK

  • Jeff Moden - Thursday, August 9, 2018 8:22 AM

    BluePeter - Thursday, August 9, 2018 5:07 AM

    Eirikur Eiriksson - Thursday, August 9, 2018 2:47 AM

    Be careful here, the recursive CTE is much slower than the inline tally table CTE method I posted, strongly advice against using a recursive CTE for this purpose.
    😎

    Agreed, it doesn't scale upwards, 10,000 is slower with a recursive cte.
    But no difference for 52 (or 53)
    I'll keep yours in mind, never seen it before.

    The reason you think there is no difference is because of what you're measuring and the way your measuring it.  There's another thing to consider, as well...

    If you practice the wrong way just because of low row counts, you'll never get good at the right way.  You also don't have to actually build the cCTE method (Cascading CTEs rather than Recursive CTEs) each and every time you want a sequence of numbers.  You can easily build an iTVF (Inline Table Valued Function) that will do that for you and, unlike scalar functions and mTVFs (Mult-statment Table Valued Functions), provides no additional hit if you refer to it within other iTVFs.

    I get that Jeff...  Thanx.
    My original point was not to rely on having rows in a User table.  And the recursive cte method was the best I knew.
    I always test code to scale.  Glad to see this new method for generating numbers on the fly.
    And it's a good tip about INLINE table valued functions.

  • Eirikur Eiriksson - Sunday, July 1, 2018 1:12 AM

    Cannot see any reason for scanning the table to produce the week numbers when it can be done more efficiently without it.
    😎

    Here is a far better method with only two constant scans and no sort operator.


    USE TEEST;
    GO
    SET NOCOUNT ON;

    ;WITH T(N) AS (SELECT X.N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0)) X(N))
    , NUMS(N) AS (SELECT TOP (52) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N
         FROM T T1,T T2)
    SELECT
      NM.N AS WEEK_NUM
    FROM  NUMS  NM
    ORDER BY NM.N;

    Eirikur, my only criticism of your code is the beginninator you have starting the CTE.  Why? the previous statement is terminated by a semicolon so there is no need for the extraneous semicolon prior to the WITH for the CTE.  Sorry, just a really big pet peeve of mine.  Plus, it really looks weird seeing a statement starting with and ending with a semicolon.

  • Lynn Pettis - Thursday, August 9, 2018 9:18 AM

    Eirikur Eiriksson - Sunday, July 1, 2018 1:12 AM

    Cannot see any reason for scanning the table to produce the week numbers when it can be done more efficiently without it.
    😎

    Here is a far better method with only two constant scans and no sort operator.


    USE TEEST;
    GO
    SET NOCOUNT ON;

    ;WITH T(N) AS (SELECT X.N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0)) X(N))
    , NUMS(N) AS (SELECT TOP (52) ROW_NUMBER() OVER (ORDER BY @@VERSION) AS N
         FROM T T1,T T2)
    SELECT
      NM.N AS WEEK_NUM
    FROM  NUMS  NM
    ORDER BY NM.N;

    Eirikur, my only criticism of your code is the beginninator you have starting the CTE.  Why? the previous statement is terminated by a semicolon so there is no need for the extraneous semicolon prior to the WITH for the CTE.  Sorry, just a really big pet peeve of mine.  Plus, it really looks weird seeing a statement starting with and ending with a semicolon.

    Hi mate, the reason for me using the beginninator is because that way, I can copy and paste / snippet insert the code without worries, regardless of the previous statement being terminated or not.
    😎

  • BluePeter - Thursday, August 9, 2018 8:38 AM

    Jeff Moden - Thursday, August 9, 2018 8:22 AM

    BluePeter - Thursday, August 9, 2018 5:07 AM

    Eirikur Eiriksson - Thursday, August 9, 2018 2:47 AM

    Be careful here, the recursive CTE is much slower than the inline tally table CTE method I posted, strongly advice against using a recursive CTE for this purpose.
    😎

    Agreed, it doesn't scale upwards, 10,000 is slower with a recursive cte.
    But no difference for 52 (or 53)
    I'll keep yours in mind, never seen it before.

    The reason you think there is no difference is because of what you're measuring and the way your measuring it.  There's another thing to consider, as well...

    If you practice the wrong way just because of low row counts, you'll never get good at the right way.  You also don't have to actually build the cCTE method (Cascading CTEs rather than Recursive CTEs) each and every time you want a sequence of numbers.  You can easily build an iTVF (Inline Table Valued Function) that will do that for you and, unlike scalar functions and mTVFs (Mult-statment Table Valued Functions), provides no additional hit if you refer to it within other iTVFs.

    I get that Jeff...  Thanx.
    My original point was not to rely on having rows in a User table.  And the recursive cte method was the best I knew.
    I always test code to scale.  Glad to see this new method for generating numbers on the fly.
    And it's a good tip about INLINE table valued functions.

    Awesome.  Thanks for the feedback.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply