Another Way to Look at Trees

  • Jeff Moden

    SSC Guru

    Points: 996831

    g.britton (9/15/2014)


    It is dyi version of a recvursive cte but avoids the rbar problem and potential exponential runtime

    I love all articles on hierarchies. Thanks for taking the time to post your thoughts and methods.

    Passing it forward and since you brought up the term "RBAR", have a look at the following articles for a differnent take on not only how to calculate levels, but also to use them to an extreme. The cool part is that you don't need to actually know or add data to a table to figure it all out. You only need to know who's the boss of each employee. I don't know how many rows your particlular example was for but the methods build the entire Nested Sets in less than a minute. From there, the sky's almost the limit. The second article tells of a different color sky.

    http://www.sqlservercentral.com/articles/Hierarchy/94040/

    http://www.sqlservercentral.com/articles/T-SQL/94570/

    You might be able to fold some of the techniques into your own.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Kev Riley

    SSCrazy Eights

    Points: 9026

    I'm glad Jeff Moden has pitched in here.

    I've been using a lot of hierarchical data in a current project, and Jeff's articles have helped immensely. I really do recommend reading them and taking some time to understand what is going on in the code.

    Larry - you mention processing a 9,000 node hierarchy in only 8 secs. I think you'll find if you adopt some of Jeff's techniques you'#ll be measuring that in milliseconds. I have one hierarchy of 92,000 nodes that I can process on the fly in under 1 second, and that's taking it from an unstructured format into and Adjacency List and then into a Nested Sets format.

  • fregatepllada

    SSCommitted

    Points: 1648

    Del

  • fregatepllada

    SSCommitted

    Points: 1648

    Jeff Moden (9/15/2014)


    fregatepllada (9/14/2014)


    Larry - are you aware that since SQL Server 2008 R2 there is a data type HierarchyID?

    If you using anything older than that version there is a Joe Celco book 100% dedicated to the

    topic of hierarchies, trees and graphs 😉

    I'm curious... are you using the HierarchyID datatype for anything now?

    No - just in cases when number of levels is UNKNOWN in design time!

    I got your irony - let's just for FUN implement a WHOLE db in a single table:

    - Identifier (Sequential GUID - PK and replication ID in the same column);

    - xml to store ANY entity (could be even strongly typed by XSD collection);

    - HierarchyID to set ALL relationships;

    This is much cooler than EAV anti-pattern: a minimalistic Zen approach to database design!

  • Jeff Moden

    SSC Guru

    Points: 996831

    fregatepllada (9/16/2014)


    Jeff Moden (9/15/2014)


    fregatepllada (9/14/2014)


    Larry - are you aware that since SQL Server 2008 R2 there is a data type HierarchyID?

    If you using anything older than that version there is a Joe Celco book 100% dedicated to the

    topic of hierarchies, trees and graphs 😉

    I'm curious... are you using the HierarchyID datatype for anything now?

    No - just in cases when number of levels is UNKNOWN in design time!

    I got your irony - let's just for FUN implement a WHOLE db in a single table:

    - Identifier (Sequential GUID - PK and replication ID in the same column);

    - xml to store ANY entity (could be even strongly typed by XSD collection);

    - HierarchyID to set ALL relationships;

    This is much cooler than EAV anti-pattern: a minimalistic Zen approach to database design!

    Heh... no. No irony intended. I've just run into a lot of folks that either directly or indirectly recommend the HierarchyID and then I find out they've not actually used it.

    If you're not one of those well meaning but inexperienced folks and you actually are using the HierarchyID data-type, then I'm very interested in knowing how many nodes your hierarchy has, what you use the hierarchy for, how you indexed it for depth and breadth, and whether or not you've experienced any problems with it. I ask because I really do want to know. I'm one of those (perhaps very wrongly so) that has looked at what it is and how it's intended to work and don't care for it based on (again, perhaps incorrectly) those things and some rumblings from folks that have had performance problems with it. The implementation that those folks designed may have been incorrect (such as forgetting the proper indexes), as well.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden

    SSC Guru

    Points: 996831

    kevriley (9/16/2014)


    I'm glad Jeff Moden has pitched in here.

    I've been using a lot of hierarchical data in a current project, and Jeff's articles have helped immensely. I really do recommend reading them and taking some time to understand what is going on in the code.

    Larry - you mention processing a 9,000 node hierarchy in only 8 secs. I think you'll find if you adopt some of Jeff's techniques you'#ll be measuring that in milliseconds. I have one hierarchy of 92,000 nodes that I can process on the fly in under 1 second, and that's taking it from an unstructured format into and Adjacency List and then into a Nested Sets format.

    Glad to see you "over here", Kev. Thank you very much for the feedback and very happy that the conversion is working well for you.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • fregatepllada

    SSCommitted

    Points: 1648

    Jeff - from business prospective I have to deal with situations when a Customer of on-line Assessment system could bring their own definition of their curriculum (and trust me each of Registered Training Organizations, Universities and Colleges has their own unique way ;-)), so we do not know upfront how many levels they would have (For instance it could be Courses, Subjects, Units, Skills, Competencies etc.).

    By some other reasons (that escapes me) business decided to squeeze into this hierarchical entity something else - for instance Program/Project/Milestones hierarchy.

    As a result I have to build a generic "infrastructure" that allow client in run-time to set up a brand new hierarchy type as a template, select existing template and just use it.

    Hope that my explanations did not totally confuse you

  • Jeff Moden

    SSC Guru

    Points: 996831

    fregatepllada (9/24/2014)


    Jeff - from business prospective I have to deal with situations when a Customer of on-line Assessment system could bring their own definition of their curriculum (and trust me each of Registered Training Organizations, Universities and Colleges has their own unique way ;-)), so we do not know upfront how many levels they would have (For instance it could be Courses, Subjects, Units, Skills, Competencies etc.).

    By some other reasons (that escapes me) business decided to squeeze into this hierarchical entity something else - for instance Program/Project/Milestones hierarchy.

    As a result I have to build a generic "infrastructure" that allow client in run-time to set up a brand new hierarchy type as a template, select existing template and just use it.

    Hope that my explanations did not totally confuse you

    Understood but, for Adjacency List hierarchies, you don't need to know how many levels there are. They can be calculated and, to save on time during lookups, stored.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply