Hierarchical Query to Return All Children

  • Hi All

    Need a bit of help on this

    I'm having some difficulty with flattening a table into its hierarchical structure. I've been trying to do this using a recursive CTE but I can't seem to get it right..

    Example data:

    DECLARE @TEMP AS TABLE

    (Orgunitcode INT, Childorgunitcode INT)

    INSERT INTO @TEMP

    SELECT 1123, 1256 UNION ALL

    SELECT 1256, 1345 UNION ALL

    SELECT 1345, 1489

    Required output:

    Org unit code children

    1123 1256

    1123 1345

    1123 1489

    1256 1345

    1256 1489

    1345 1489

    As you can see I need the query to return all children, as 1123 is the top most level 1256, 1345 and 1489 are all children of this Orgunit

    My code so far:

    WITH RecursiveCTE (Child, Parent, RecLevel) AS

    (

    --initialization

    SELECT

    Childorgunitcode

    ,Orgunitcode

    ,1 AS RecLevel

    FROM @TEMP T1

    UNION ALL

    --recursive execution

    SELECT

    T.Childorgunitcode

    ,T.Orgunitcode

    ,RecursiveCTE.RecLevel + 1

    FROM

    @TEMP T

    INNER JOIN RecursiveCTE

    ON T.Orgunitcode = RecursiveCTE.Child

    )

    SELECT

    Parent

    ,Child

    FROM

    RecursiveCTE A

    Some help would be appreciated!

    This doesn't have to be a recursive query, I know that they are not the best performance wise so I'm certainly open to more elegant / creative solutions!

    Thanks All

    Andy

    ==========================================================================================================================
    A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila. Mitch Ratcliffe

  • The CTE method I could come up with

    ; WITH cte_Recursive( Parent, Child, RN, Level ) AS

    (

    SELECTT.Orgunitcode, T.Childorgunitcode, ROW_NUMBER() OVER ( ORDER BY Orgunitcode ) AS RN, 1 AS Level

    FROM@TEMP AS T

    WHERENOT EXISTS( SELECT * FROM @TEMP AS TI WHERE T.Childorgunitcode = TI.Orgunitcode )

    UNION ALL

    SELECTTR.Orgunitcode, TR.Childorgunitcode, CR.RN, Level + 1

    FROM@TEMP AS TR

    INNER JOINcte_Recursive CR ON TR.Childorgunitcode = CR.Parent

    )

    SELECTCR.Parent, iTVF.Child

    FROMcte_Recursive CR

    CROSS APPLY(

    SELECTCRI.Parent, CRI.Child

    FROMcte_Recursive CRI

    WHERECR.RN = CRI.RN AND CR.Level >= CRI.Level

    ) iTVF

    ORDER BY CR.Parent, iTVF.Child


    Kingston Dhasian

    How to post data/code on a forum to get the best help - Jeff Moden
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • In your code change

    --recursive execution

    SELECT

    T.Childorgunitcode

    ,T.Orgunitcode

    ,RecursiveCTE.RecLevel + 1

    to

    --recursive execution

    SELECT

    T.Childorgunitcode

    ,RecursiveCTE.Parent

    ,RecursiveCTE.RecLevel + 1

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Both, thank you for the replies very helpful

    I'm interested (just out of curiosity) if there are any other ways to achieve the same result without using a recursive query if anyone has any thoughts?

    Thanks again

    Andy

    ==========================================================================================================================
    A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila. Mitch Ratcliffe

  • Since it's an adjacency hierarchy (ID and ParentID hierarchy), you'll have to crawl it recursively by some means or another. A recursive CTE is just as good as any other method.

    If you want to get away from recursive hierarchy crawling, you have to go with either a nested sets hierarchy, or use the SQL 2008 (or later) HierarchyID datatype, or a path hierarchy.

    ID and ParentID is adjacency, and requires recursive querying.

    Nested Sets uses ranges.

    HierarchyID is a CLR datatype that has methods and properties available.

    Path Hierarchies would have things like the parent node has an "A" in the column, and every node below it in the hierarchy starts with "A". "AA" and "AB" would be children of "A", while "ABA", "ABB", "ABC" would be children of "AB" and grand-children of "A". The path elements can be separated by punctuation, "A.A", "A.B.A", or not. No effect on how you query it either way. Doesn't have to use letters, and can have more than one character for an element if the levels are bigger than 26 possible nodes.

    The others, Adjacency, Nested Sets, and HierarchyID, can be looked up online with Bing/Google/whatever.

    The main advantage of the ones that don't use adjacency is that they don't have to be queried recursively. That makes them much, much faster to query. Usually slower to update/insert/delete, but faster to select. I've done tests where a 100-thousand-row, 6-level-deep adjacency query took over 8 minutes to query, while the same hierarchy done as nested sets took just about .01 seconds to query. On the same hardware, etc.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • If you want to get away from recursive hierarchy crawling, you have to go with either a nested sets hierarchy, or use the SQL 2008 (or later) HierarchyID datatype, or a path hierarchy.

    Thanks Gus I've looked at this and is very interesting reading..

    However the back end DB is a clients database and I can't make structural changes at this time..

    I have a small problem, after executing this query I hit the max recursion threshold of 32767, I've managed to overcome this using a "current flag" however I'm concerned that in the future more data could cause a problem..

    So if anyone has a method that I can use to overcome this and future proof the code suggestions would be gratefully received..

    ==========================================================================================================================
    A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila. Mitch Ratcliffe

  • @Andy,

    If you're sure there are no "loops" within the hierarchy, you could set the MAX Recursion Limit to "0" to be unlimited. I am, however, concerned that you may actually have some "loops" in the hierarchy. The parts list for a Boeing 747 is only 18 levels deep and the hierarcy for one of the larger MLMs in the world is only sever thousand deep.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If you're running into the recursion limit like that, you probably have an infinite loop going, as Jeff mentioned.

    To get around that, without structural changes to the table (since you can't make those), I suggest avoiding a recursive CTE and moving to a looping UDF.

    Declare a table variable, or create a temp table, and insert the first set of rows into it (might just be one top row, might be more). Loop like:

    while @@rowcount > 0

    insert into #nodes (ID)

    select ID

    from dbo.MyHierarchyTable

    where ParentID in (select ID from #nodes)

    and ID not in (select ID from #nodes);

    The performance will suffer, possibly a little, possibly a lot, but it will not loop indefinitely, because of the Not In statement. If the ID column can be null, you need to include a "and ID is not null" in the Where clause, because it will otherwise cause weird behavior in the Not In statement.

    Try that, see what that does. Even if you don't end up using it in production code, it can be used to help you find loops so you can correct your data.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • @Andy,

    If you're sure there are no "loops" within the hierarchy, you could set the MAX Recursion Limit to "0" to be unlimited. I am, however, concerned that you may actually have some "loops" in the hierarchy. The parts list for a Boeing 747 is only 18 levels deep and the hierarcy for one of the larger MLMs in the world is only sever thousand deep.

    That is a good point Jeff, I will check this, the data is external so it is possible there is some "bad data" in there somewhere, I'll take this on board (no pun intended ;-)) and check

    Google the Nested sets model. It is so much easier than the non-relational adjacency list model you have.

    Thanks Mr C, Gus kindly pointed me in this direction as well and makes sense but as I said this is not my database and (at the moment anyway) it has to remain as my client has constructed therefore no changes are possible..

    @Gus

    Thanks for the code and suggestion if we hit trouble down the line then I'll certainly take a look at this (maybe sooner rather than later depending on data volumes!)

    All as always you are a great help and thanks 🙂

    Andy

    ==========================================================================================================================
    A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila. Mitch Ratcliffe

  • P.S. A While Loop will actually beat the rCTE for performance here. It's not substantial enough to necessarily warrent its use but I thought I'd bring it up.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden P.S. A While Loop will actually beat the rCTE for performance here. It's not substantial enough to necessarily warrent its use but I thought I'd bring it up.

    I thought I'd read that somewhere (actually Jeff I think it was one of your articles or maybe a post) How would you write a While Loop for something like this?

    ==========================================================================================================================
    A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila. Mitch Ratcliffe

  • Andy Hyslop (7/16/2012)


    Jeff Moden P.S. A While Loop will actually beat the rCTE for performance here. It's not substantial enough to necessarily warrent its use but I thought I'd bring it up.

    I thought I'd read that somewhere (actually Jeff I think it was one of your articles or maybe a post) How would you write a While Loop for something like this?

    He was talking about the last solution I posted. That's a while-loop version.

    In the speed tests I've done with rCTE vs While loops, they've been similar enough in performance that I think they use the same binaries behind the scenes in the SQL Server engine. Could easily be wrong about that, but it looks that way in the tests I've done.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (7/17/2012)


    He was talking about the last solution I posted. That's a while-loop version.

    First, apologies for not responding sooner. I've actually be working on a demonstration of what I meant and haven't completed it yet.

    Shifting gears to the comment above... Yes and no. Yes, I was talking about a While Loop. No, I wasn't talking about Gus' code. 🙂

    Neither the rCTE nor Gus' code actually tells you where the infinite loop is. Gus has a nice bypass in his code for such anomolies but I'm not sure that you want a bypass. I'd think that you'd want to identify the problem in the hierarchy and fix it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • First, apologies for not responding sooner. I've actually be working on a demonstration of what I meant and haven't completed it yet.

    No problem at all Jeff! 🙂

    Yes as you suggest I would be interested in seeing your demo when it's finished, as I would like to identify if there is a problem in the hierarchy.

    As I mentioned previously the code is working but I am concerned that it may become a problem later on down the line if more bad data is inserted. I do have Gus to thank for a workaround but am always interested in different approaches!

    ==========================================================================================================================
    A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila. Mitch Ratcliffe

  • Andy Hyslop (7/18/2012)


    First, apologies for not responding sooner. I've actually be working on a demonstration of what I meant and haven't completed it yet.

    No problem at all Jeff! 🙂

    Yes as you suggest I would be interested in seeing your demo when it's finished, as I would like to identify if there is a problem in the hierarchy.

    As I mentioned previously the code is working but I am concerned that it may become a problem later on down the line if more bad data is inserted. I do have Gus to thank for a workaround but am always interested in different approaches!

    Be very carefull. The problem with the workaround is that you could end up missing some very important information. If you have, let's say, a downline chain of 20 items and the infinite loop in the hierarchy appears just 3 items down, then you'll miss that item and 17 other items in the downline chain. Downline chains can become quite lengthy (downline chain for the root node is all the nodes in the hierarchy) so you could end up not processing hundreds or even thousands of nodes on larger hierarchies.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 36 total)

You must be logged in to reply to this topic. Login to reply