CTE - How to surpass the limit

  • Hi there,

    I've seen most people here using CTEs, but... do they really have a limit?

    WITH CTE (ctr1) AS (
        SELECT 1
        UNION ALL
        SELECT ctr1 + 1
        FROM CTE
        WHERE ctr1 < 32767
    )
    SELECT DISTINCT ctr1 FROM CTE c1

    By default it has a limit of 100, so the query above fails with an error once recursion goes past 100 levels.

    Using OPTION (MAXRECURSION 32767) you can raise the limit to 32767, or to any lower number you specify (32768 and above is not allowed; you can try it if you want).

    WITH CTE (ctr1) AS (
        SELECT 1
        UNION ALL
        SELECT ctr1 + 1
        FROM CTE
        WHERE ctr1 < 32767
    )
    SELECT DISTINCT ctr1 FROM CTE c1
    OPTION (MAXRECURSION 32767)

    Can a CTE surpass that 32767 limit?

    NOTE: these are just simple examples. I know there are other ways to generate numbers; I just wanna know how to get past that 32767 limit.

    Thanks in advance guys. ^__^

    _____________________________________________
    Quatrei Quorizawa
    :):D:P;):w00t::cool::hehe:
    MABUHAY PHILIPPINES!

    "Press any key...
    Where the heck is the any key?
    hmmm... Let's see... there's ESC, CTRL, Page Up...
    but no any key"
    - Homer Simpson
  • Quatrei.X (8/26/2010)


    Can a CTE surpass that 32767 limit?

    No. And there's no need for it to, either. Generally, if you've got a CTE that exceeds 100 recursions, you've got a circular reference, so it's your SQL that needs correcting.

    -EDIT-

    :blush: Whoopsie, I'm wrong


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Yes. You can surpass the limit by setting MAXRECURSION to 0.

    OPTION (MAXRECURSION 0)

    But when you use this option, always make sure your query can't fall into an infinite loop, because the server will no longer stop the recursion for you.
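    Applied to the counting CTE from the original post, that looks like the following. This is a minimal sketch; the 50000 upper bound is just an arbitrary example to show the counter passing 32767, and the WHERE clause is now the only thing stopping the recursion.

    -- Count past 32767: MAXRECURSION 0 removes the server-side cap entirely,
    -- so the termination condition in the WHERE clause is all that stops this.
    WITH CTE (ctr1) AS (
        SELECT 1
        UNION ALL
        SELECT ctr1 + 1
        FROM CTE
        WHERE ctr1 < 50000   -- arbitrary example bound; pick your own
    )
    SELECT ctr1 FROM CTE
    OPTION (MAXRECURSION 0)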


    Kingston Dhasian

    How to post data/code on a forum to get the best help - Jeff Moden
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • hmmmm...

    :hehe: I was planning to create a generic function which takes a comma-separated list of words in a string and uses a CTE to separate those strings into table rows.

    🙂 Currently my function works fine, but if the records grow, it might start to fail in the future.

    😀 I know how to do this with loops or XML, but since a CTE looks like a much faster option, I might make good use of it... but only if it has no limit.

    _____________________________________________
    Quatrei Quorizawa
    :):D:P;):w00t::cool::hehe:
    MABUHAY PHILIPPINES!

    "Press any key...
    Where the heck is the any key?
    hmmm... Let's see... there's ESC, CTRL, Page Up...
    but no any key"
    - Homer Simpson
  • Kingston Dhasian (8/26/2010)


    Yes. You can surpass the limit by setting MAXRECURSION to 0.

    OPTION (MAXRECURSION 0)

    But when you use this option, always make sure your query can't fall into an infinite loop, because the server will no longer stop the recursion for you.

    :w00t: Thanks man!

    hehehe, funny, using zero crossed my mind earlier, but I thought it wasn't worth trying... guess we should really try and try until we die :hehe:

    _____________________________________________
    Quatrei Quorizawa
    :):D:P;):w00t::cool::hehe:
    MABUHAY PHILIPPINES!

    "Press any key...
    Where the heck is the any key?
    hmmm... Let's see... there's ESC, CTRL, Page Up...
    but no any key"
    - Homer Simpson
  • Quatrei.X (8/26/2010)


    hmmmm...

    :hehe: I was planning to create a generic function which takes a comma-separated list of words in a string and uses a CTE to separate those strings into table rows.

    🙂 Currently my function works fine, but if the records grow, it might start to fail in the future.

    😀 I know how to do this with loops or XML, but since a CTE looks like a much faster option, I might make good use of it... but only if it has no limit.

    A recursive CTE is not really a faster option; it is still looping, and it will grind to a halt as more data is processed. XML would probably be the best of those options.

    However, a much better way is to use a tally table and set-based logic:

    http://www.sqlservercentral.com/articles/T-SQL/62867/
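    To make the set-based idea concrete, here's a minimal sketch of a tally-based split. The dbo.Tally numbers table and the @list sample value are my assumptions for illustration; see Jeff's article at the link above for the full treatment.

    -- Split a comma-separated list with a tally table: no loops, no recursion.
    -- Assumes dbo.Tally(N) holds sequential integers 1, 2, 3, ... covering the string length.
    DECLARE @list VARCHAR(8000)
    SET @list = 'alpha,bravo,charlie'   -- hypothetical sample input

    SELECT SUBSTRING(',' + @list + ',', t.N + 1,
                     CHARINDEX(',', ',' + @list + ',', t.N + 1) - t.N - 1) AS word
    FROM dbo.Tally t
    WHERE t.N <= LEN(',' + @list)                      -- stop before the trailing comma
      AND SUBSTRING(',' + @list + ',', t.N, 1) = ','   -- each comma marks the start of an element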

  • Quatrei.X (8/26/2010)


    hmmmm...

    :hehe: I was planning to create a generic function which takes a comma-separated list of words in a string and uses a CTE to separate those strings into table rows.

    🙂 Currently my function works fine, but if the records grow, it might start to fail in the future.

    😀 I know how to do this with loops or XML, but since a CTE looks like a much faster option, I might make good use of it... but only if it has no limit.

    Have a look at the link below. It is one of the best delimited string split functions available:

    http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx#bm944589


    Kingston Dhasian

    How to post data/code on a forum to get the best help - Jeff Moden
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • :w00t: First of all, thanks Kingston, steveb and skcadavre!

    I read through the links on the tally table and the tally CTE.

    I have two follow-up questions.

    Question 1:

    Tally table or tally CTE: which is the better practice? (I'm already leaning toward the CTE, but I still wanna hear from you guys.)

    Question 2:

    I have seen Jeff Moden's dbo.DelimitedSplit8K function sample, but why the "multiple CTEs" (not sure what they're called)? Anyway, let's look at these two examples.

    Jeff Moden's CTE in his dbo.DelimitedSplit8K

    WITH
    E1(N) AS ( --=== Create Ten 1s
        SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1 --10
    ),
    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
    cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4)
    SELECT N FROM cteTally

    The CTE I was using to generate numbers

    ;WITH CTE (ctr1) AS (
        SELECT 1
        UNION ALL
        SELECT ctr1 + 1 FROM CTE WHERE ctr1 < 10000
    )
    SELECT DISTINCT ctr1 FROM CTE c1
    OPTION (MAXRECURSION 10000)

    Both return the same set of numbers.

    I tried running both queries in one batch...

    The query cost relative to the batch is 84% for the first one and 16% for the second, so I'm just wondering why Jeff chose his approach. Hope he also reads this. (I'll just post a link there to avoid double posting.)

    BTW, both run in less than a second; I just wanna gain more knowledge hehehe. :hehe:
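    Worth noting: because the cascading tally above contains no recursion at all, MAXRECURSION never applies to it, so the same pattern can also go far past 32767. Here's a minimal sketch extending the pattern by one more level; the E8 level and the TOP count of 100000 are illustrative additions, not Jeff's code:

    -- No recursion means no MAXRECURSION cap: cascaded cross joins yield up to
    -- 10^8 candidate rows, and TOP trims the tally to exactly the count needed.
    WITH
    E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
              SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1), --10
    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
    E8(N) AS (SELECT 1 FROM E4 a, E4 b), --100,000,000
    cteTally(N) AS (SELECT TOP (100000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E8)
    SELECT N FROM cteTally  -- 100000 rows, well past 32767, no OPTION hint required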

    _____________________________________________
    Quatrei Quorizawa
    :):D:P;):w00t::cool::hehe:
    MABUHAY PHILIPPINES!

    "Press any key...
    Where the heck is the any key?
    hmmm... Let's see... there's ESC, CTRL, Page Up...
    but no any key"
    - Homer Simpson
  • Quatrei.X (8/27/2010)


    Question 1:

    Tally table or tally CTE: which is the better practice? (I'm already leaning toward the CTE, but I still wanna hear from you guys.)

    Jeff Moden's CTE in his dbo.DelimitedSplit8K

    WITH
    E1(N) AS ( --=== Create Ten 1s
        SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1 --10
    ),
    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
    cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4)
    SELECT N FROM cteTally

    The CTE I was using to generate numbers

    ;WITH CTE (ctr1) AS (
        SELECT 1
        UNION ALL
        SELECT ctr1 + 1 FROM CTE WHERE ctr1 < 10000
    )
    SELECT DISTINCT ctr1 FROM CTE c1
    OPTION (MAXRECURSION 10000)

    Jeff's tally CTE beats your recursive CTE by quite a margin, Quatrei. Here are my test rig results after five trials, running both of the above queries in separate windows.

    -- Jeff Moden's Tally (SQL Server Execution Times):
    CPU time = 16 ms,  elapsed time = 4 ms.
    CPU time = 15 ms,  elapsed time = 5 ms.
    CPU time = 0 ms,   elapsed time = 4 ms.
    CPU time = 0 ms,   elapsed time = 4 ms.
    CPU time = 0 ms,   elapsed time = 4 ms.

    -- Quatrei's CTE (SQL Server Execution Times):
    CPU time = 156 ms, elapsed time = 151 ms.
    CPU time = 156 ms, elapsed time = 154 ms.
    CPU time = 125 ms, elapsed time = 156 ms.
    CPU time = 141 ms, elapsed time = 150 ms.
    CPU time = 140 ms, elapsed time = 149 ms.

    Look at the CPU time the recursive CTE takes; and this is just for 10,000 rows. What if you were generating more than a million rows? Hmmm...
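    (For anyone wanting to reproduce these numbers, the harness is nothing more exotic than STATISTICS TIME; a minimal sketch, shown here around the recursive version, with the figures read from the Messages tab:)

    SET STATISTICS TIME ON

    -- query under test goes here; e.g. the recursive counter
    ;WITH CTE (ctr1) AS (
        SELECT 1
        UNION ALL
        SELECT ctr1 + 1 FROM CTE WHERE ctr1 < 10000
    )
    SELECT DISTINCT ctr1 FROM CTE c1
    OPTION (MAXRECURSION 10000)

    SET STATISTICS TIME OFF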

    Both return the same set of numbers.

    I tried running both queries in one batch...

    The query cost relative to the batch is 84% for the first one and 16% for the second, so I'm just wondering why Jeff chose his approach. Hope he also reads this. (I'll just post a link there to avoid double posting.)

    The Actual Execution Plan will also, at times, lie... There is a lot going on in the code you posted: the queries return their results to the "Results" pane, and that work gets folded into the query cost too. Hmmm... Hope this clears the air.

  • Quatrei.X (8/27/2010)


    The query cost relative to the batch is 84% for the first one and 16% for the second.

    Those costs are estimates. There are lots of things that make the estimates wrong, one being an inaccurate row count estimate. If you look at more than just the cost estimate on the exec plans, you'll notice that the row count estimations of the CTE are completely inaccurate.

    Estimated rows: 2

    Actual rows: 10000

    That's going to really mess up those costs and, as a result, the estimates.

    I need to do more reading and investigation but, from what I remember, a recursive CTE is typically costed very badly by the optimiser.
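    One way to see that mis-estimate without opening the graphical plan is STATISTICS PROFILE, which returns the actual Rows and the EstimateRows for each operator side by side. A minimal sketch using the recursive counter from earlier in the thread:

    -- Compare the Rows and EstimateRows columns in the extra result set: for the
    -- recursive part the optimiser guesses a handful of rows while 10000 actually flow through.
    SET STATISTICS PROFILE ON

    ;WITH CTE (ctr1) AS (
        SELECT 1
        UNION ALL
        SELECT ctr1 + 1 FROM CTE WHERE ctr1 < 10000
    )
    SELECT DISTINCT ctr1 FROM CTE c1
    OPTION (MAXRECURSION 10000)

    SET STATISTICS PROFILE OFF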

    I'm just wondering why Jeff chose his approach.

    Because it's incredibly fast. Also that approach is widely used. If I recall, I've seen Itzik Ben-Gan use it, so it's not just Jeff.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi there,

    Here's another method for splitting comma-separated values, using a recursive CTE without a tally table. When the string contains more than 8000 characters, this performs quite a bit faster (otherwise the DelimitedSplit8K function is better) :-)

    CREATE FUNCTION dbo.Split (@string VARCHAR(MAX), @delimiter CHAR(1))
    RETURNS TABLE
    AS
    RETURN (
        WITH Temp(id, position1, position2) AS (
            SELECT 1, CAST(1 AS INT), CAST(CHARINDEX(@delimiter, @string) AS INT)
            UNION ALL
            SELECT id + 1, CAST(position2 AS INT) + 1, CAST(CHARINDEX(@delimiter, @string, position2 + 1) AS INT)
            FROM Temp
            WHERE position2 > 0
        )
        SELECT id, SUBSTRING(@string, position1, CASE WHEN position2 = 0 THEN LEN(@string) + 1 - position1 ELSE position2 - position1 END) AS word
        FROM Temp
    )
    GO

    DECLARE @string VARCHAR(MAX)
    SELECT @string = STUFF((SELECT ',' + CAST(ABS(CHECKSUM(NEWID()) % 10000) AS VARCHAR) AS [text()]
                            FROM sys.all_columns a --, sys.all_columns b
                            FOR XML PATH('')), 1, 1, '')
    SELECT LEN(@string)

    SET STATISTICS TIME ON
    SELECT * FROM dbo.Split(@string, ',') OPTION (MAXRECURSION 0)
    SET STATISTICS TIME OFF

    DROP FUNCTION dbo.Split

    I tried modifying the DelimitedSplit8K function to accept varchar(max) (although it's already noted there that varchar(max) affects its performance) and increasing the tally table (just for testing :-D). The rows generated are still incomplete because I was not able to generate enough numbers for the tally table, but the Split function is 2 seconds faster. :w00t:

    I have also seen Oleg's procedure using XML in another post and included it in this test. It's cool and it's faster! Thanks for your procedure! 🙂

    Overall, this is the execution summary on my machine (character length generated = 1193011; expected rows = 244036):

    -- DelimitedSplit8K function
    (204542 row(s) affected)
    SQL Server Execution Times: CPU time = 12890 ms, elapsed time = 13087 ms.

    -- Split function
    (244036 row(s) affected)
    SQL Server Execution Times: CPU time = 9984 ms, elapsed time = 10575 ms.

    -- Oleg's usp_DelimitedSplit stored procedure
    SQL Server parse and compile time: CPU time = 54 ms, elapsed time = 54 ms.
    SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms.
    SQL Server Execution Times: CPU time = 31 ms, elapsed time = 34 ms.
    SQL Server Execution Times: CPU time = 3891 ms, elapsed time = 3881 ms.
    (244036 row(s) affected)
    SQL Server Execution Times: CPU time = 2953 ms, elapsed time = 3170 ms.
    SQL Server Execution Times: CPU time = 6938 ms, elapsed time = 7145 ms.

    When I tried to turn Oleg's stored procedure into an inline function, the query became slower, even though I used derived tables and CROSS APPLY. Did I miss something?

    "Actual Execution Plan will "also" , at times, lie".. nice one ColdCoffee! Execution time is the best metric to measure performance..

    I hope this helps! 😀

  • KurtdTajsadz (8/27/2010)


    hi, i am Ronald Powel

    thanks all my friends

    Reported as spam.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question
    How to post performance related questions
    Links for Tally Table, Cross Tabs and Dynamic Cross Tabs, Delimited Split Function

  • shield_21 (8/27/2010)


    Hi there,

    Here's another method for splitting comma-separated values, using a recursive CTE without a tally table. When the string contains more than 8000 characters, this performs quite a bit faster (otherwise the DelimitedSplit8K function is better) :-)

    CREATE FUNCTION dbo.Split (@string VARCHAR(MAX), @delimiter CHAR(1))
    RETURNS TABLE
    AS
    RETURN (
        WITH Temp(id, position1, position2) AS (
            SELECT 1, CAST(1 AS INT), CAST(CHARINDEX(@delimiter, @string) AS INT)
            UNION ALL
            SELECT id + 1, CAST(position2 AS INT) + 1, CAST(CHARINDEX(@delimiter, @string, position2 + 1) AS INT)
            FROM Temp
            WHERE position2 > 0
        )
        SELECT id, SUBSTRING(@string, position1, CASE WHEN position2 = 0 THEN LEN(@string) + 1 - position1 ELSE position2 - position1 END) AS word
        FROM Temp
    )
    GO

    DECLARE @string VARCHAR(MAX)
    SELECT @string = STUFF((SELECT ',' + CAST(ABS(CHECKSUM(NEWID()) % 10000) AS VARCHAR) AS [text()]
                            FROM sys.all_columns a --, sys.all_columns b
                            FOR XML PATH('')), 1, 1, '')
    SELECT LEN(@string)

    SET STATISTICS TIME ON
    SELECT * FROM dbo.Split(@string, ',') OPTION (MAXRECURSION 0)
    SET STATISTICS TIME OFF

    DROP FUNCTION dbo.Split

    I tried modifying the DelimitedSplit8K function to accept varchar(max) (although it's already noted there that varchar(max) affects its performance) and increasing the tally table (just for testing :-D). The rows generated are still incomplete because I was not able to generate enough numbers for the tally table, but the Split function is 2 seconds faster. :w00t:

    I have also seen Oleg's procedure using XML in another post and included it in this test. It's cool and it's faster! Thanks for your procedure! 🙂

    Overall, this is the execution summary on my machine (character length generated = 1193011; expected rows = 244036):

    -- DelimitedSplit8K function
    (204542 row(s) affected)
    SQL Server Execution Times: CPU time = 12890 ms, elapsed time = 13087 ms.

    -- Split function
    (244036 row(s) affected)
    SQL Server Execution Times: CPU time = 9984 ms, elapsed time = 10575 ms.

    -- Oleg's usp_DelimitedSplit stored procedure
    SQL Server parse and compile time: CPU time = 54 ms, elapsed time = 54 ms.
    SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms.
    SQL Server Execution Times: CPU time = 31 ms, elapsed time = 34 ms.
    SQL Server Execution Times: CPU time = 3891 ms, elapsed time = 3881 ms.
    (244036 row(s) affected)
    SQL Server Execution Times: CPU time = 2953 ms, elapsed time = 3170 ms.
    SQL Server Execution Times: CPU time = 6938 ms, elapsed time = 7145 ms.

    When I tried to turn Oleg's stored procedure into an inline function, the query became slower, even though I used derived tables and CROSS APPLY. Did I miss something?

    "Actual Execution Plan will also, at times, lie"... nice one, ColdCoffee! Execution time is the best metric for measuring performance.

    I hope this helps! 😀

    I'm testing Oleg's code now. The place where most XML methods fall apart is when there are multiple rows; they do work fast as all get out on a single row, though.

    Speaking of testing, if you're going to make claims of speed testing, would you mind posting the code you used to generate the test data and your full test harness so we can verify your findings? Thanks.
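    (For anyone unfamiliar with what such a harness looks like, here's a minimal sketch; the table name, row count, and three-element CSV strings are hypothetical placeholders, not Jeff's or shield_21's actual rig. The point is that the splitter gets applied once per row via CROSS APPLY, which is exactly where single-row-fast methods tend to fall over.)

    -- Build a hypothetical multi-row test set: 1000 rows, each holding a short CSV string.
    IF OBJECT_ID('tempdb..#CsvTest') IS NOT NULL DROP TABLE #CsvTest
    SELECT TOP (1000)
           IDENTITY(INT, 1, 1) AS RowNum,
           CAST(ABS(CHECKSUM(NEWID()) % 10000) AS VARCHAR(10)) + ',' +
           CAST(ABS(CHECKSUM(NEWID()) % 10000) AS VARCHAR(10)) + ',' +
           CAST(ABS(CHECKSUM(NEWID()) % 10000) AS VARCHAR(10)) AS CsvColumn
    INTO #CsvTest
    FROM sys.all_columns

    -- Time the splitter across every row, not just one string.
    SET STATISTICS TIME ON
    SELECT c.RowNum, s.id, s.word
    FROM #CsvTest c
    CROSS APPLY dbo.Split(c.CsvColumn, ',') s   -- splitter under test (e.g. the dbo.Split above)
    SET STATISTICS TIME OFF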

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • GilaMonster (8/27/2010)


    Quatrei.X (8/27/2010)


    The query cost relative to the batch is 84% for the first one and 16% for the second.

    Those costs are estimates. There are lots of things that make the estimates wrong, one being an inaccurate row count estimate. If you look at more than just the cost estimate on the exec plans, you'll notice that the row count estimations of the CTE are completely inaccurate.

    Estimated rows: 2

    Actual rows: 10000

    That's going to really mess up those costs and, as a result, the estimates.

    I need to do more reading and investigation but, from what I remember, a recursive CTE is typically costed very badly by the optimiser.

    I'm just wondering why Jeff chose his approach.

    Because it's incredibly fast. Also that approach is widely used. If I recall, I've seen Itzik Ben-Gan use it, so it's not just Jeff.

    Gail is correct. If you look at the comments in the header, you'll see that the code is the result of a whole lot of folks' input (including Ben-Gan's) over a whole lot of time.

    Gail is also spot on about recursive CTEs. The only thing that gets measured for the estimated and actual execution plans in a recursive CTE is the first iteration. That's what makes it look so fast: the plan covers only one iteration.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ColdCoffee (8/27/2010)


    Quatrei.X (8/27/2010)


    Question 1:

    Tally table or tally CTE: which is the better practice? (I'm already leaning toward the CTE, but I still wanna hear from you guys.)

    Jeff Moden's CTE in his dbo.DelimitedSplit8K

    WITH
    E1(N) AS ( --=== Create Ten 1s
        SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1 --10
    ),
    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
    cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4)
    SELECT N FROM cteTally

    The CTE I was using to generate numbers

    ;WITH CTE (ctr1) AS (
        SELECT 1
        UNION ALL
        SELECT ctr1 + 1 FROM CTE WHERE ctr1 < 10000
    )
    SELECT DISTINCT ctr1 FROM CTE c1
    OPTION (MAXRECURSION 10000)

    Jeff's tally CTE beats your recursive CTE by quite a margin, Quatrei. Here are my test rig results after five trials, running both of the above queries in separate windows.

    -- Jeff Moden's Tally (SQL Server Execution Times):
    CPU time = 16 ms,  elapsed time = 4 ms.
    CPU time = 15 ms,  elapsed time = 5 ms.
    CPU time = 0 ms,   elapsed time = 4 ms.
    CPU time = 0 ms,   elapsed time = 4 ms.
    CPU time = 0 ms,   elapsed time = 4 ms.

    -- Quatrei's CTE (SQL Server Execution Times):
    CPU time = 156 ms, elapsed time = 151 ms.
    CPU time = 156 ms, elapsed time = 154 ms.
    CPU time = 125 ms, elapsed time = 156 ms.
    CPU time = 141 ms, elapsed time = 150 ms.
    CPU time = 140 ms, elapsed time = 149 ms.

    Look at the CPU time the recursive CTE takes; and this is just for 10,000 rows. What if you were generating more than a million rows? Hmmm...

    Both return the same set of numbers.

    I tried running both queries in one batch...

    The query cost relative to the batch is 84% for the first one and 16% for the second, so I'm just wondering why Jeff chose his approach. Hope he also reads this. (I'll just post a link there to avoid double posting.)

    The Actual Execution Plan will also, at times, lie... There is a lot going on in the code you posted: the queries return their results to the "Results" pane, and that work gets folded into the query cost too. Hmmm... Hope this clears the air.

    Exactly... now take those time differences and multiply their usage by something as simple as 10,000 rows (a very small "batch", actually)... the difference between taking 15ms and 150ms per row suddenly becomes the difference between 150 seconds (2.5 minutes) and 1,500 seconds (25 minutes).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
