Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Two Useful Hierarchy Functions


Two Useful Hierarchy Functions

Author
Message
John Cawley III
John Cawley III
Forum Newbie
Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)Forum Newbie (2 reputation)

Group: General Forum Members
Points: 2 Visits: 29
Comments posted to this topic are about the item Two Useful Hierarchy Functions
dsilveira
dsilveira
Grasshopper
Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)

Group: General Forum Members
Points: 18 Visits: 5
very nicely done!
jeb bushell
jeb bushell
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 9
Hard to read due to poor web presentation. I'm using IE7 and the code segments are not readable unless they're cut and pasted elsewhere. I ran out of time and patience.
ddnrg
ddnrg
Forum Newbie
Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)

Group: General Forum Members
Points: 6 Visits: 23
Does the new SQL2008 HeirarchyId pretty much replace the need for doing this?
Chris Muller-398403
Chris Muller-398403
Forum Newbie
Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)

Group: General Forum Members
Points: 1 Visits: 10
Nice example, thanks for simplifying this complex subject. Would it have been better to make the node_id column in the table example an identity column for saving data (after a user selected one item) and reporting? What are the costs/benefits?

Also, I guess the hierarchy could change too, what if there was an org. change in that example, where Alan left and another became the start of the tree? In my application, do we want to save the pick of Alan in your hierarchy (hence that identity column idea above) or just the current starting node for the year, no matter who it is at that point in time? I guess that may depend on your application needs?

Thanks again for the nice article and great subject.
GSquared
GSquared
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14375 Visits: 9729
I've used hierarchies with as many as 7,000 nodes and as many as 50 levels. CTEs work well for those.

But one thing that should be brought up in any discussion of hierarchies is the "nested sets" model.

The type of hierarchy defined in this article is called an "adjacency hierarchy". A nested sets hierarchy works a little different.

Here's a sample table for a nested sets hierarchy:


create table NSHierarchy (
SetStart int not null,
SetEnd int not null,
constraint PK_NSHierarchy primary key (SetStart, SetEnd),
Name varchar(100))



The way it works is that the SetStart column and the SetEnd column define a range of values, and lower levels of the hierarchy are given values between those.

For example:


insert into dbo.NSHierarchy (SetStart, SetEnd, Name)
select 1, 10, 'Mandy' union all
select 2, 5, 'Bob' union all
select 6, 9, 'Sue' union all
select 3, 3, 'Doug' union all
select 4, 4, 'Sam' union all
select 7, 7, 'Terry' union all
select 8, 8, 'Sal'



In this example, Mandy is the top level of the hierarchy. Bob and Sue report directly to Mandy, since their numbers are inside her range (anything between 1 and 10 is in her hierarchy). Doug and Sam report to Bob (their numbers are between 2 and 5, which are Bob's numbers), while Sam, Terry and Sal report to Sue (between 6 and 9).

This is queried as:


create proc HierarchySelect
(@TopNode_in int)
as
declare @SEnd int

select @SEnd = SetEnd
from dbo.NSHierarchy
where SetStart = @TopNode_in

select Name
from dbo.NSHierarchy
where SetStart between @TopNode_in and @SEnd
and SetEnd between @TopNode_in and @SEnd



(Variations on this to deal with the name as an input parameter, or to allow a level to have two parents (SetStart in one range, SetEnd in another), etc., are pretty easy to extrapolate.)

Joe Celko has a much more complete and comprehensive description of this on his site and in his books. I won't try to go into that level of detail, because he's already done a better job of it than I will.

The advantage to this is that it's MUCH faster than an adjacency hierarchy.

For example, that 7,000-node, 50-level hierarchy, takes almost a second to run (on my test machine) using a CTE, exactly as described in the article. Not bad, but it also takes 7,000 scans (IO). That's a LOT of IO. With a nested sets hierarchy, it takes less than a millisecond to run, and takes 2 scans.

The disadvantage of a nested sets hierarchy is that it takes more work to update/insert than an adjacency hierarchy. If the data changes a lot, it becomes much more difficult to manage.

I've solved that for my situation by using a hybrid solution (partially suggested by Jeff Moden). I have tables that looks like this:


create table Hierarchies (
HierarchyID int identity primary key,
Name varchar(100))
go
create table HierarchiesNodes (
HierarchyID int not null references dbo.Hierarchies(HierarchyID),
NodeID int identity primary key,
ParentID int null references dbo.Hierarchy(NodeID),
SetStart int null,
SetEnd int null,
SetLevel int null,
CustomerID int not null references dbo.Customers(CustomerID),
ActiveDate datetime default(getdate()),
InactiveDate datetime null)



The update, insert and delete code for the nodes table sets the SetStart and SetEnd columns to null for any affected rows. The lookup code (select), checks to see if any level that's in the same hierarchy (HierarchyID) has a null SetStart or SetEnd. If so, it uses a CTE to pull the hierarchy. If not, it uses a nested sets query to pull it. Every hour, the table is scanned for null SetStart and SetEnd values, and if any are found, those hierarchies have their set data updated. Again, Joe Celko has an article on how to do that efficiently (it's not hard to figure out if you know how to use a CTE to resolve the hierarchy in the first place).

I use the HierarchyID column as the first column in the indexes on this table. It's selective enough, in my case, that it seriously speeds up the selects. It doesn't matter for the set queries, but it does for the CTEs.

I use the SetLevel column for the nested sets queries, because unlike a recursive CTE, I can't just have it use a "Depth + 1" calculation if I want levels. It also allows me to limit the number of levels, if that's appropriate for the query at hand. Just like the HierarchyID doesn't matter to the sets, the SetLevel column doesn't matter for the CTE, but I need both for this hybrid solution.

Thus far, this gives us the best of both models.

I thought it should be mentioned, since this data matters if you deal with hierarchies much.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
GSquared
GSquared
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14375 Visits: 9729
ddnrg (5/21/2008)
Does the new SQL2008 HeirarchyId pretty much replace the need for doing this?


That one, per what I've read, is fast to query, but slow to update/insert/delete. Adjacency hierarchies (like the one described in the article) are fast to update/insert/delete and somewhat slower to query. So it might, or might not, replace it, depending on specific needs.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)

Group: General Forum Members
Points: 45084 Visits: 39912
Nice and simple, pleasure to read. Well done, John! Smile

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Anipaul
Anipaul
SSCertifiable
SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)

Group: General Forum Members
Points: 6275 Visits: 1407
Very well written..........



Omama Hatari
Omama Hatari
Grasshopper
Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)

Group: General Forum Members
Points: 10 Visits: 63
Got error "Invalid object name 'salaries'."
Table is not created anywhere
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search