Recursive CTE Vs Temp Table in Recursive User Defined Procedure

  • Hello Friends,

    I have fairly huge table having 1 million records. The table has a binary tree structure.

    I want to write a procedure for getting all the children records for a particular parent record.

    What will be more faster and efficient Guys. Recursive CTE or the use of temp table in recursive user defined Procedure???

    Thanks

    Bhavesh



    [font="System"]Bhavesh Patel[/font]

    http://bhaveshgpatel.wordpress.com/
  • bhavesh_183 (7/30/2009)


    Hello Friends,

    I have fairly huge table having 1 million records. The table has a binary tree structure.

    I want to write a procedure for getting all the children records for a particular parent record.

    What will be more faster and efficient Guys. Recursive CTE or the use of temp table in recursive user defined Procedure???

    Thanks

    Bhavesh

    You'll want to test it to be sure, but I'd say it would be the CTE. But there are alternatives that could be faster still. Do a search on the phrase "table of numbers" or "tally table." There are several solutions using this type of approach that can outperform recursive CTE's.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I have a huge table with 1 million records with binary tree structure:

    The simplified Table (tbl_tree1) with Binary Tree data is given below:

    ID Name ParentID Age

    1 A 0 21

    2 B 1 25

    3 C 1 25

    4 D 2 26

    5 E 2 27

    6 F 3 30

    7 G 3 19

    8 H 4 18

    9 I 4 24

    10 J 5 37

    11 K 5 26

    12 L 6 24

    13 M 6 28

    14 N 7 21

    15 O 7 20

    I have written a procedure using recursive CTE to find out all the the children of a given parent ID. But the problem with this procedure is it takes 2 hr 10 min to retrieve details of the parentID having 107891 children. This code is given below:

    ALTER PROCEDURE [dbo].[USP_BinaryTree]

    (

    @parentid INT

    )

    AS

    BEGIN

    SET NOCOUNT ON

    BEGIN TRY

    BEGIN

    WITH Tree_CTE AS

    (

    SELECTID, Name, ParentID, Age

    FROMtbl_tree1

    WHEREParentID=@parentid

    UNION ALL

    SELECTt.ID, t.Name, t.ParentID, t.Age

    FROMtbl_tree1 t

    INNER JOIN Tree_CTE tcte ON tcte.id = t.parentID

    )

    SELECT * FROM Tree_CTE ORDER BY ID

    END

    END TRY

    BEGIN CATCH

    PRINT @@ERROR

    END CATCH

    END

    Please can some one suggest more efficient and faster procedure coding for the same.

    Thanks

    Bhavesh



    [font="System"]Bhavesh Patel[/font]

    http://bhaveshgpatel.wordpress.com/
  • If the binary tree doesn't change very often (say, once a day), you can use a Nested Set model in parallel with your current Adjacency Model. Joe Celko has some pretty good articles on the Nested Set thing... it makes for SET BASED lookups for both the downline and upline that are faster than most can believe is possible with hierarchical data. I recommend you Google for "Nested Set Joe Celko".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff. I wasn't sure whether or not to suggest the nested set approach. I don't have any experience with it, but I knew it was a possibility.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Binary trees are also particularily good candidates for the "Path-Only" model.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Here's a link...

    http://www.intelligententerprise.com/001020/celko.jhtml

    Just be a little careful... one of the many articles on the Web for Nested Sets has two things to be slightly concerned with and I don't remember which one... one of them wipes out the adjacency model table as it builds the nested set and the other one needs a minor patch so that it will properly include the final node.

    I have used Nested Set models in parallel with an Adjacency Model. I did them in parallel because the data didn't change much, Adjacency Models are (somehow) easier for humans to think about, and the incredible speed advantages of the Nested Set Model are just incredible. And when I say I did them in parallel, I mean in the same table... I added the required left and right "bowers" (that's what I call them... old Navy term for a particular type of anchor) to the adjacency model table.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here's a naive implementation of the Path-Only model with your data that demonstrates it's advantages:

    create table Binary_tree (PathKey Varchar(30) primary key,[Name] varchar(20),Age int);

    INSERT into binary_tree (PathKey, Name, Age)

    SELECT '','A',21

    UNION SELECT '0','B',25

    UNION SELECT '1','C',25

    UNION SELECT '00','D',26

    UNION SELECT '01','E',27

    UNION SELECT '10','F',30

    UNION SELECT '11','G',19

    UNION SELECT '000','H',18

    UNION SELECT '001','I',24

    UNION SELECT '010','J',37

    UNION SELECT '011','K',26

    UNION SELECT '100','L',24

    UNION SELECT '101','M',28

    UNION SELECT '110','N',21

    UNION SELECT '111','O',20

    --topologically sorted:

    Select * From binary_tree order by PathKey

    --all children of a node:

    Select * From binary_tree Where PathKey Like '1%'

    It's principle disadvantage is the size of the path key. If that becomes an issue, then there are ways (admittedly, obtuse ways) to compact it.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • There seems to be a performance threshold about 13 levels deep.

    See http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=129754


    N 56°04'39.16"
    E 12°55'05.25"

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply