Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12345»»»

Hierarchies on Steroids #1: Convert an Adjacency List to Nested Sets Expand / Collapse
Author
Message
Posted Tuesday, November 13, 2012 7:00 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 10:09 PM
Points: 38,338, Visits: 35,255
Jason-299789 (11/13/2012)
A very nice article jeff, Its given me a few Ideas on how to change one of the Hierarchy builders I have, just need to find the down time to test it.

Looking forward to future installments.



Thank you for the feedback, Jason. The article mentioned in the "p.s." comes out on Thursday, 15 November ( 2 days from now).

I'd be interested in knowing what your ideas are, if you have the time.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1384068
Posted Tuesday, November 13, 2012 7:04 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 10:09 PM
Points: 38,338, Visits: 35,255
mister.magoo (11/13/2012)
Jeff Moden (11/12/2012)
Just like the old cartoon, I've got to say... "Ah, Magoo! You've done it again!"

Here are the run results using the same machine I used in testing for the article for a million nodes using your fine addition.

Building the initial table and SortPath...
There are 1000000 rows in dbo.Hierarchy
Cumulative Duration = 00:00:14:420
Building the Nested Sets...
1000000 rows have been updated to Nested Sets
If the rowcounts don't match, there may be orphans.
Cumulative Duration = 00:00:42:023
Building the indexes...
Cumulative Duration = 00:00:47:330
===============================================
RUN COMPLETE
===============================================



In whole numbers, that a rock solid 13% improvement. Well done, Magoo!

My only question now is, is it the "Identity Hack" that did it or is it the fact that you used a While Loop to replacce the rCTE (and we've previously proven that such loops will frequenctly beat rCtEs). I'll give it a try tomorrow.

Speaking of that, tomorrow is only a half hour a way so I need to hit the hay or the people at work are going to have to put up with a really cranky ol' DBA tomorrow. Thanks again for the code, Magoo.

To the best of my knowledge, the identity has no effect on speed directly, but it allows us to "remember" a value between statements without using a variable, which in turn means the while loop can operate with just one statement... That is where the advantage is gained.
If you have to perform operations in the while loop to manipulate a variable it becomes costly.

The result of doing the inserts in this way is a series of set based inserts with no extra overhead.


Ah... I missed that completely last night. Thanks, Magoo. I've been meaning to do a deeper dive on your "Identity Hack" since the first time you brought it up on the forums. I definitely need to take a closer look.

As a side bar, you should write a "spackle" article on the subject. As my boss wwould say, it's a "spec-hacular" trick.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1384072
Posted Tuesday, November 13, 2012 7:22 AM


Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: 2 days ago @ 12:08 PM
Points: 53, Visits: 606
Stick???

I'd say this article is more like a bat; and a really good tool to hammer this example into the minds of some data model dudez i need work with to develop an SNMP MIB database :)

Thanks for sharing !!!

Theo
Post #1384083
Posted Tuesday, November 13, 2012 8:06 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 10:09 PM
Points: 38,338, Visits: 35,255
Theo Ekelmans (11/13/2012)
Stick???

I'd say this article is more like a bat; and a really good tool to hammer this example into the minds of some data model dudez i need work with to develop an SNMP MIB database :)

Thanks for sharing !!!

Theo


Heh... if you like what can be done with a bat, then you'll really like the 16# sledge hammer you'll find in the followup article on hierarchies coming out this Thursday (15 Nov 2012), Theo. It shows another bit of "black arts" magic that preaggregates the million node hierarchy to provide most of the answers to questions that you might ever ask of a hierarchy and it only takes 8 seconds longer.

Thanks for the feedback, Theo. I'd also be interested in a more detailed version of what you're doing for your SNMP MIB database. It sounds like a great deal of fun!


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1384116
Posted Tuesday, November 13, 2012 9:12 AM


Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: 2 days ago @ 12:08 PM
Points: 53, Visits: 606
LOL, gimme thor class hammer!!!

The project i'm working on is a roof of concept SNMP MIB database, that is so fast it can do (near)realtime SNMP data packet to SQL data conversion

Just in case your new to the extensible SNMP MIB tree:

If E-mail addresses were OIDs... user@ordina.org would have been something like: user@ordina.enterprises.private.internet.dod.org.iso
or in non-translated notation: user@99999.1.4.1.6.3.1

except that we write the top-most part at the left: 1.3.6.1.4.1.99999.117.115.101.114

An OID is just a unique key (within one managed device) for one piece of information Ensures vendors don't have conflicting OIDs

OID corresponds to a label .1.3.6.1.2.1.1.5 => sysName (in windows this is the servername)

The complete (translated) path: .iso.org.dod.internet.mgmt.mib-2.system.sysName

The info needed to get from one to an other is "mib (database) files", that are supplied with each new device / software version.

The provided MIB files all have a part of the ginormous jigsaw MIB tree, and that needs to be searched between 500 to 20.000 times a minute.

Fun project indeed :)
Post #1384146
Posted Tuesday, November 13, 2012 11:36 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Friday, August 28, 2015 10:36 AM
Points: 1,406, Visits: 876
Excellent article, Jeff! I look forward to Thursday's installment.

I've played around with mixing these two representations of hierarchies before, but had never gotten past the performance issues, so it's great to have a new approach to try. It seems that I keep inheriting db implementations that contain at least one hierarchy in them, so having more choices for dealing with them is a very good thing!

For those that are willing to give up a little in terms of understandability (which can be dealt with via abstractions, anyway), and who are working with fairly constrained hierarchy sizes, take some time to investigate the matrix path encoding technique that Tropashko describes. You can think of it as a "best of both worlds" approach to the problem that doesn't have the heavy recalculation requirements. I was also able to obtain favorable performance results for many of the descendant- and ancestor-type queries. If nothing else, it will expand your db math skillset

And finally, for those with some disk space to burn, you might consider materializing the transitive closure of the hierarchy. This can really net you some performance if you have a large hierarchy that is used primarily for reads (the materialization is a rather expensive process).

-TroyK



Post #1384219
Posted Tuesday, November 13, 2012 2:31 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Thursday, August 13, 2015 3:23 PM
Points: 3,427, Visits: 3,696
Hi Jeff,
Simply brilliant!

Thanks
IgorMi




Igor Micev,
SQL Server developer at Seavus
www.seavus.com
Post #1384303
Posted Tuesday, November 13, 2012 2:56 PM
Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Today @ 9:22 AM
Points: 567, Visits: 2,363
Jeff,

In the code for generating the tally table, you use this syntax:
row_number() over (order by (select null))

I have never see this before. But it works. What is this '(select null)' trick all about?

PS
Very, very neat article!
Post #1384320
Posted Tuesday, November 13, 2012 3:56 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Wednesday, August 19, 2015 4:43 AM
Points: 65, Visits: 876
Michael Meierruth (11/13/2012)

What is this '(select null)' trick all about?


It's a trick to 'order' by constant value, i.e. to not order/sort at all; in this case it adds a row number number based on the natural order that the processor hit the rows, rather than pre-sorting by a given column(set). You can't just do OVER(ORDER BY 1), because it interprets it as attempting to use the first column in the select, which OVER() won't let you do, but it does seem to let you get away with OVER(ORDER BY (SELECT 1)) [or, in this case, OVER(ORDER BY (SELECT NULL))].

I can't see in the BOL syntax tree where a select is possible (bar some vague reference to an 'expression') but it's been stable enough for me in the past. Maybe someone with a keener understanding of the nuances of the BOL can spot the bit that makes it a 'documented feature'.

In any case, I have often wondered if it might be equivalent to `OVER (ORDER BY the_clustered_index_col)` (at least for a plain table), or whether that would incur more reads. I have never bothered to try because OVER(ORDER BY (SELECT NULL)) always worked well enough for me.
Post #1384334
Posted Tuesday, November 13, 2012 7:32 PM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Yesterday @ 3:31 PM
Points: 19,175, Visits: 17,474
Nice job Jeff.



Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server, MVP


SQL RNNR

Posting Performance Based Questions - Gail Shaw
Post #1384369
« Prev Topic | Next Topic »

Add to briefcase ««12345»»»

Permissions Expand / Collapse