rCTE vs LIKE for Hierarchy

  • Duration is usually what I shoot for because that's all end users can perceive or actually care about (other than accuracy, of course). I'll even use "extra resources" to improve the end-user experience if need be. There are other times when I look down the road and see that something should not use extra resources because of future "load" on the machine, even if that does mean better performance in the short term. "It Depends".

    For the IO, as much as I hate to admit it, "It Depends". There's physical IO (disk) and there's logical IO (basically, some form of memory in most cases). As you have found, there's usually a direct correlation between the number of reads something does and its performance. That's not always the case (Paul White did a brilliant job on a very high-performance Triangular Join in the past). Still, I also try to limit reads (especially when a large number of reads is involved) because they're hitting either the disk system or the memory system.

    As for why rCTEs appear to use so many more reads than other methods, I have my own suspicions, but I don't really want to know that much about the "guts" of it because it's not likely I can change it. As Paul White once suggested, it may be something as simple as SQL Server counting each row read as a "read" instead of measuring how many pages were read.
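    In case it helps anyone reading along, here's a minimal sketch of how to see those scan and read counts for yourself. The table and column names are hypothetical stand-ins; the point is only where the numbers show up. The "Worktable" line in the output is where most of an rCTE's extra reads appear.

        -- Show scan counts and logical reads per table in the Messages tab.
        SET STATISTICS IO ON;
        SET STATISTICS TIME ON;

        -- Hypothetical adjacency table: dbo.Hierarchy (ID, ParentID).
        WITH Crawl AS
        (
            SELECT ID, ParentID
            FROM dbo.Hierarchy
            WHERE ID = 1                       -- root of the crawl
            UNION ALL
            SELECT h.ID, h.ParentID
            FROM dbo.Hierarchy h
            JOIN Crawl c ON h.ParentID = c.ID  -- next level down
        )
        SELECT ID, ParentID
        FROM Crawl
        OPTION (MAXRECURSION 0);               -- no depth limit

        SET STATISTICS IO OFF;
        SET STATISTICS TIME OFF;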

    Thanks for your time, Jeff. That was the answer I was looking for; response time is the very reason I'm tuning this! And while the rCTE solved the response-time issues, I just could not live with the doubt of where all that IO was coming from!

    Ariel Nessi (9/11/2012)

    --------------------------------------------------------------------------------

    1146 rows, which should not be a problem, but some procedures take up to 4 minutes to run with the LIKE code, and that's pretty bad. My concern is that the rCTE code generates something like 100 table scans with 10k+ logical reads, and I'm a beginner at this, but I don't think that much IO is OK. The LIKE code posted does something like 1 scan and 68 logical reads, yet performs much, much worse time-wise, and that's where I get lost.

    I believe I have a method that will blow the doors off things like that. What are your more lengthy runs doing? For example, are you trying to calculate the "rollup cost" of each node?

    Actually, for those IO numbers I'm just getting all children and sub-children in the hierarchy, starting from the uppermost record in the hierarchy (so it's pretty much a SELECT * FROM). This piece of code is used to filter another table with a join on it.

    It works like this:

    The user picks the locations he'd like to search, so when he picks, let's say, France, I'll drill down the hierarchy getting every district, street, city and so on, and then join those IDs to another table.

    When I pick the location "Universe", I get that IO.
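    A minimal sketch of that kind of drill-down filter (all table and column names here are hypothetical stand-ins for the ones being described):

        -- Hypothetical tables: dbo.Location (LocationID, ParentID, Name)
        -- and dbo.Sale (SaleID, LocationID, ...).
        DECLARE @PickedID INT = 42;  -- e.g. the ID the user picked ("France")

        WITH Descendants AS
        (
            SELECT LocationID
            FROM dbo.Location
            WHERE LocationID = @PickedID
            UNION ALL
            SELECT l.LocationID
            FROM dbo.Location l
            JOIN Descendants d ON l.ParentID = d.LocationID
        )
        SELECT s.*
        FROM dbo.Sale s
        JOIN Descendants d ON s.LocationID = d.LocationID
        OPTION (MAXRECURSION 0);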

    For all that matters, my question was answered: IO is not the holy grail (in all cases).

  • Jeff Moden (9/14/2012)


    GSquared (9/13/2012)


    ...

    Tests that don't involve any table query, just INSERT statements or a RAM-resident SELECT (like the cCTE), don't test all the factors in building an adjacency-hierarchy crawl in a While loop.

    Here's a test of that kind of thing:

    ...

    I created a test table a while back with a set of complex hierarchies in it. The one that starts with ID 1, used in this test, is 427 levels deep. So the rCTE needs the MaxRecursion option in order to crawl the whole thing. That slows it down, of course.
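    For anyone following along, a self-contained illustration of the option being described (no tables needed; without the OPTION clause this fails at the default limit of 100 recursion levels):

        -- Counts to 427 levels; OPTION (MAXRECURSION 0) removes the
        -- default 100-level cap (any value up to 32767 also works).
        WITH DeepCrawl AS
        (
            SELECT 1 AS Lvl
            UNION ALL
            SELECT Lvl + 1 FROM DeepCrawl WHERE Lvl < 427
        )
        SELECT MAX(Lvl) AS MaxLevel
        FROM DeepCrawl
        OPTION (MAXRECURSION 0);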

    In my tests (core i7 Quad, 16 Gig RAM, Windows 7 64-bit, running SQL 2008 R2 Dev Edition), the rCTE averages 6 milliseconds for the whole crawl, including the DDL for the Select Into (used to eliminate data-rendering time). The While loop averages 2296 milliseconds.

    ...

    How many rows are in that hierarchy, and is there any chance of seeing similar DDL for the table, including the indexing directly related to your queries? If you could provide the average and max fan-out for your tests, that would help. I'd like to do some additional testing on this, and that information might help me design the typical million-row test table for something like this. Thanks, Gus.

    Table structure is the one from my recent hierarchies article.

    The number of rows and the complexity of the hierarchies in it vary depending on what I'm testing. This test was 427 levels deep with each level being one node. I built that set just to test how deep HierarchyID could go, and didn't bother changing it to test this aspect.

    I've tested more complex hierarchies, While vs rCTE, and the results have always been comparable, with the While loop being much slower. The slowdown comes from having to filter each pass so it doesn't rerun what was pulled in prior passes. There are ways to work around that, but you end up with insanely complex code, and it's still slower than a simple rCTE for an adjacency crawl in every test I've run. (The worst workaround I ever saw used a recursive UDF and nested cursors to avoid WHERE NOT IN. Ouch! Talk about slow!)
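    A minimal sketch of the kind of loop being described, including the filtering step that causes the slowdown (table and column names are hypothetical):

        -- Hypothetical adjacency table: dbo.Hierarchy (ID, ParentID).
        DECLARE @Found TABLE (ID INT PRIMARY KEY);
        INSERT INTO @Found (ID) VALUES (1);  -- seed with the root

        WHILE @@ROWCOUNT > 0
        BEGIN
            -- Pull the next level down, filtering out everything
            -- already collected; this anti-join is the slow part.
            INSERT INTO @Found (ID)
            SELECT h.ID
            FROM dbo.Hierarchy h
            JOIN @Found f ON h.ParentID = f.ID
            WHERE h.ID NOT IN (SELECT ID FROM @Found);
        END;

        SELECT ID FROM @Found;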

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • On the I/O: the high reads and scans on rCTEs come from having to check each row that comes up for child rows.

    Example (simple) hierarchy:

    ID  ParentID
    --  --------
    1   NULL
    2   1
    3   1
    4   2
    5   2
    6   3
    7   3

    If you crawl starting at 1, it first has to seek the index for the row Where ID = 1 (a seek shows as a scan in the I/O stats), then do 1 scan to find all rows where ParentID = 1. Then it has to take ID = 2 and scan for all children of 2, and do the same for all children of 3, so 2 scans for that level. We're already looking at 5 scans and 7 reads, and this hierarchy is only 3 levels deep with 7 total nodes. It gets worse the wider and deeper the hierarchy is.
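    A self-contained way to watch those numbers yourself, using the seven-row hierarchy above (the exact scan counts can vary by version and plan, and most of the extra reads land on the worktable):

        -- Recreate the seven-row example and inspect the I/O stats.
        CREATE TABLE #H (ID INT PRIMARY KEY, ParentID INT NULL);
        CREATE INDEX IX_H_ParentID ON #H (ParentID);

        INSERT INTO #H (ID, ParentID)
        VALUES (1, NULL), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3), (7, 3);

        SET STATISTICS IO ON;

        WITH Crawl AS
        (
            SELECT ID, ParentID FROM #H WHERE ID = 1
            UNION ALL
            SELECT h.ID, h.ParentID
            FROM #H h
            JOIN Crawl c ON h.ParentID = c.ID
        )
        SELECT ID, ParentID FROM Crawl;

        SET STATISTICS IO OFF;
        DROP TABLE #H;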

    Nested Sets:

    Left  Right
    ----  -----
    1     12
    2     7
    3     4
    5     6
    8     11
    9     9
    10    10

    That's the same hierarchy (1 top node with 2 immediate children, each of which has 2 children). I/O will be 1 scan. If the hierarchy is the clustered index (it normally is), the speed is nearly instantaneous on these. I've seen a thousand-node hierarchy, 70+ levels deep, with variable width and complexity on the down-paths, take 2 milliseconds to query and return the data. It took longer to pipe it to the web server over Gigabit Ethernet than it did to pull it from the SAN, even without caching.
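    A minimal sketch of why it's a single scan; the table and column names (LeftBower/RightBower) are hypothetical:

        -- Hypothetical nested-sets table: dbo.HierarchyNS
        -- (NodeID, LeftBower, RightBower), clustered on LeftBower.
        DECLARE @Left INT, @Right INT;

        SELECT @Left = LeftBower, @Right = RightBower
        FROM dbo.HierarchyNS
        WHERE NodeID = 1;  -- root of the subtree to pull

        -- The whole subtree sits in one contiguous range of the
        -- clustered index, so this is a single range seek/scan.
        SELECT NodeID, LeftBower, RightBower
        FROM dbo.HierarchyNS
        WHERE LeftBower BETWEEN @Left AND @Right;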

    Plus, nested sets doesn't allow for infinite-loop hierarchies. Adjacency, especially multi-parent adjacency, can create that problem. Nested sets also doesn't allow for broken hierarchies. Delete the wrong node without correcting for it, and an adjacency hierarchy breaks: in the example above, if you delete node 3, nodes 6 and 7 become orphans. A good trigger can help prevent that, but you can't always count on it covering everything you need. Delete node (2, 7) in the nested sets, and nodes (3, 4) and (5, 6) are not orphans; they simply become immediate children of node (1, 12).

    Of course, if you need to move a whole branch around, that's dead easy in adjacency and a pain in the *** in nested sets. It's easy but slower with HierarchyID.
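    For the adjacency case, "dead easy" really is a one-row change (names hypothetical); in nested sets, the same move means renumbering the left/right values across a large slice of the table:

        -- Repointing a single ParentID moves the entire branch;
        -- no other rows change.
        DECLARE @BranchRootID INT = 3, @NewParentID INT = 2;

        UPDATE dbo.Hierarchy
        SET ParentID = @NewParentID
        WHERE ID = @BranchRootID;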

    Check my articles for details.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

