Hierarchies on Steroids #2: A Replacement for Nested Sets Calculations

  • Jeff,

    Thank you soooo much! I have been working on this all week and have not been successful.

    Kathy

  • Hi Jeff! Have you had time to look at my problem?

    Thanks,

    Kathy Davis

  • Apologies. I really had to think about this.

    First, I guess I don't understand why you would need an order in the output. The output bears no resemblence to a graphical org chart. It's all vertical data that most users won't ever see in that form.

    The second thing is that if your REALLY need it in a given vertical order, then you can't use EmployeeIDs either as the child or parent ID. Instead, you'd need to predetermine what the order would be according to the graphical org chart and assign a position number to each node. The child and parent IDs would then be based on those positions and you would simply add the information for the employee holding that current position. The good part about that is that it becomes stupid simple to change someone in a position. The bad part is, you have to know what the org chart should look like so that you can assign the positions and the parents of each position.

    That's about the only way I can think of doing this without it being by EmployeeID or by Name.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    Thank you so much for your assistance. I will use your advice!

    Kathy

  • Thanks for the feedback, Kathy. If you get a chance, stop back and let us know how it worked out. I think originally deciding how to create the positional ID notation might be a bit of a chore but then maintenance becomes easy after that.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (11/15/2012)


    I hear ya!!! If the final PreaggregatedHierarchy table is the source of information for a 24/7/.999999999 web site or application, then you really can’t afford even the 62 seconds of downtime especially if you’re running a world wide Multi-Level Marketing company! You really don’t want to make any of the members angry with even a minute of “downtime” while you rebuild the table.

    So what do you do?

    It’s simple. Have a synonym (or pass-through view) pointing at the current version of the pre-aggregated hierarchy table while you build up a new version. Once completed, simply repoint the synonym or view to the new table and drop the old table (or keep it to do monthly comparisons for a month). The total “downtime” is now measured in milli-seconds.

    Next month, you do the same thing. Just keep “flip-flopping” the synonym or view between the two tables.

    You can even include the building of the Primary Key in an online fashion like this IF you let the system name the PKs for you instead of adding them as a “named constraint”. Adding the other indexes are a piece of cake because index names don’t need to be unique in a database like constraints need to be.

    Hi Jeff,

    I was wondering if you could elaborate on the above with some code examples or references.

    Now that I seem to understand your Hierarchies on Steroids after about 3 readings, I'm anxious to learn more.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Thanks for the question, Dwain. It's real simple.

    Assuming that you already have a hierarchy table (We'll call it "H" for this example and all names will be greatly simplified. Substitute as your heart desires) that you want to update...

    1. Rename the hierarchy table from "H" to "H1".

    2. Create a synonym called "H" and point it at "H1". The "H" synonym takes the place of the table in all "user" code and procs without any changes.

    3. Create another hierarchy table called "H2" even if it's empty.

    4. Create another synonym called "HW" (hierarchy work) and point it at "H2".

    At this point in time, we have synonym "H" pointing at the currently "online" data in table "H1".

    At this point in time, we have synonym "HW" pointing at the currently "offline" table "H2".

    The time comes for a rebuild of the hierarchy. Here are the steps for that. The final result is that the hierarchical data was unavailable for only milliseconds even though the rebuild may have taken a minute.

    1. Run the code to rebuild the "H2" table through the "HW synonym. Except for the table truncate, the prevents the need for all other dynamic SQL for the rebuild.

    2. Once "H2" has been successfully rebuilt, simply repoint the "H" synonym at "H2", which contains the new current data to bring it all "online". Total downtime is whatever it takes to drop and rebuilt just the "H' synonym.

    3. Repoint the "HW" synonym to the now old data in "H1". You can either keep "H1" for "previous value" comparisons, or truncate it.

    4. Next update, you simply rebuild "H1" through the "HW" synonym, repoint the "H" synonym to the "H1" to bring the updated data online, and repoint the "HW" synonym to "H2" in preparation for the next rebuild.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/20/2015)


    Thanks for the question, Dwain. It's real simple.

    Assuming that you already have a hierarchy table (We'll call it "H" for this example and all names will be greatly simplified. Substitute as your heart desires) that you want to update...

    1. Rename the hierarchy table from "H" to "H1".

    2. Create a synonym called "H" and point it at "H1". The "H" synonym takes the place of the table in all "user" code and procs without any changes.

    3. Create another hierarchy table called "H2" even if it's empty.

    4. Create another synonym called "HW" (hierarchy work) and point it at "H2".

    At this point in time, we have synonym "H" pointing at the currently "online" data in table "H1".

    At this point in time, we have synonym "HW" pointing at the currently "offline" table "H2".

    The time comes for a rebuild of the hierarchy. Here are the steps for that. The final result is that the hierarchical data was unavailable for only milliseconds even though the rebuild may have taken a minute.

    1. Run the code to rebuild the "H2" table through the "HW synonym. Except for the table truncate, the prevents the need for all other dynamic SQL for the rebuild.

    2. Once "H2" has been successfully rebuilt, simply repoint the "H" synonym at "H2", which contains the new current data to bring it all "online". Total downtime is whatever it takes to drop and rebuilt just the "H' synonym.

    3. Repoint the "HW" synonym to the now old data in "H1". You can either keep "H1" for "previous value" comparisons, or truncate it.

    4. Next update, you simply rebuild "H1" through the "HW" synonym, repoint the "H" synonym to the "H1" to bring the updated data online, and repoint the "HW" synonym to "H2" in preparation for the next rebuild.

    Thanks for the explanation Jeff.

    When you reset the synonyms like this, does that cause cached query plans to be recompiled?


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • I've never had a performance problem because of such switches but if I were forced at gunpoint to make a conjecture, I would unfortunately still have to say I don't know because I've never tested for that.

    I can, however, tell you that SQL Server will prevent the drop/rebuild of the synonym if someone is still using it. THAT I've tested for! It will sit and patiently wait to get exclusive use.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • CELKO (3/23/2015)


    I cover some to this stuff in my TREES IN SQL book. Which will have this thread's material when they ask me to update it. Hey. Imitation is the sincerest form of plagiarism :Whistling:

    No problem. Just send me lots of real USD$ money when you do. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Since you seem hell bent on stealing it for profit, I'll make you a deal. 😉

    Since these articles are copyrighted, it would seem that a partial co-author cut of the profits (I'd have to give Adam a cut out of that) and an honorable mention on the cover as a co-author would be in order. 😉 Whatcha figure for percentage for full use of text, graphics, and code from this thread and the Hierarchies on Steroids #1 thread for that book only and the promise not to buy a black vest? 😀 I'd settle for pages of my stuff/total substantial chapter pages in the book (doesn't include things like the TOC, Index, Glossary, References, Front Matter, Blank Pages, etc) as a percentage provided there was no cost to me.

    I won't keep any of the money, though. If Adam agrees, I'd have the money sent to one of my favorite charities and the publishing company could use that as an incentive to have folks buy more of the books.

    How's that sound? Deal?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • My interest in this discussion thread is peaking now.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Okay, so I'm sitting in my cube today contemplating my typical Adjacency List employees table and of course the equally typical recursive CTE contraption I use to query it hither and yon and think: "There has GOT to be a better way to do this." We only have about 40 or so employees, so speed isn't really a concern but still...

    As you can imagine, the Google search leads me to various web sites discussing Nested Sets and Modified Preorder Tree traversals among other things. Nested Sets are a cool concept but seemed to me (and if the inimitable Mr. Celko is lurking hereabouts, this is an observation, not a criticism) to be shifting the complexity from the querying of the data to the updating/maintenance of the data. But I figured since this was data that gets queried far more than updated the trade-off is probably worth it.

    I'm considering posting a question on SSC about the subject when I stumble across this article on the MSDN blogs about converting Adjacency Lists to Nested Sets by this guy rambling on about RBAR and Tally Tables and I think "Whoever this is must hang out on SSC."

    Then I scrolled up and saw the author. Should have known.

    Still wrapping my head around some of it, but I can see already this can do more than just improve my Employees table. I like the fact that it takes a really good idea and builds on it without the all-too-common "Technique X is evil!" tripe encountered on the Internet.

    Excellent job on both articles.

    ____________
    Just my $0.02 from over here in the cheap seats of the peanut gallery - please adjust for inflation and/or your local currency.

  • Thanks, lshanahan. That's one of the nicer compliments I've ever had on the subject. I appreciate it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff. Excellent article and techniques.

    One question: What would be the best way to use this when we expect CRUD operations on the Employee list. Take a typical app that calls, say, a stored proc that adds an employee to the list - should we re-generate everything on every little chatty call like Add/Delete or is there something more efficient.

    I am looking for the technique that would best suit this use case.

Viewing 15 posts - 31 through 45 (of 50 total)

You must be logged in to reply to this topic. Login to reply