Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12

Everybody Reports to Somebody Expand / Collapse
Author
Message
Posted Monday, June 4, 2007 9:21 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Wednesday, September 24, 2014 1:20 PM
Points: 1,276, Visits: 1,135

Hi David and Craig,

The LeftID and RightID define a subset.  In the case of John, his LeftID=1 and RightID=24.  Anyone with a LeftID and RightID between these two numbers are defined by a subset of John's set.  Bob's has the LeftID=2 and RightID=15, so his numbers are a subset of John's set {1..24} (apologies for the "fake" set notation .)  Therefore, Bob reports to John.  And so on...

Actually you don't need to use the adjacency list to renumber the left and right ID's of your nested sets.  There are some other options to improve efficiency.  In the example I gave the sets were a "tight" fit, with the left and right id numbers as close together as possible.  However, that's not a requirement.  You can actually spread them out further to start with, so long as they still properly "nest" within one another.  For instance:

John Smith, LeftID=1, RightID=1000

Bob Johnson, LeftID=20, RightID=150

Sue Fields, LeftID=300, RightID=400


In this case, Bob and Sue are still nested within John Smith, but the range between LeftID and RightID has been widened.  You can easily insert or delete individuals without recalculating (dependent on business rules).  The recalculations come in when Bob's entire range is filled and he gets another subordinate.

Using this method you don't need to recalculate LeftID and RightID as often, which, as David pointed out can hurt performance on large data sets.  Renumbering can also be done with just a few statements (albeit slightly more complex ones).  This method also has the advantage that you can calculate things like the level of each employee in the hierarchy, total height of the tree, etc., relatively easily.

You can also convert adjacency list to nested sets, and vice versa.

Celko's written a lot on the topic of nested sets versus adjacency list.  He also has a pretty good book on the subject - "Trees and Hierarchies in SQL".  Although it's not SQL Server specific, his examples are fairly easy to port over.

Also, if you're going to use the adjacency list, you might consider using recursive CTEs (SQL 2K5) - they were defined for this type of thing

Post #371039
Posted Monday, June 4, 2007 9:31 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, October 29, 2013 5:29 AM
Points: 104, Visits: 721

"In practice, does the data in the set correspond to an organisation's departmental structure, rather than individual relationships"

This is just a different model for storing hierarchical data in SQL tables. The origial article presented an adjacent set model, Mike presented a nested sets model, and there's another I've seen used called the maternalized path model. The maternalized path model is basically the same structure as a URL, folders and paths. In that model you store a node's entire ancestry in a column. To get to your question though, with both the nested sets and maternalized path models the information you use to maintain the tree structure can be, and usually is, entirely fictional with no relationship whatsoever with the data it's associated with. Kind of like inventing an ID column for a table instead of using a person's name as a key for instance.

I haven't had to model hierarchical data for a project in a while but I used the nested sets model to do it when I did. It really isn't as difficult to use or maintain as some articles I've read about it indicate -- of course I'll suffix that by saying each model has its advantages and disadvantages. Changing someone's boss, for instance, is essentially moving a node within the tree. In the nested sets model this boils down to updating rows that are between numerical ranges and adding or subtracting numbers. With the maternalized path model you update rows that are LIKE a certain string. In either case you could potentially update every row in the table. The adjacent set model wouldn't update as many rows to do this.

Post #371045
Posted Monday, June 4, 2007 11:31 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Wednesday, September 24, 2014 1:20 PM
Points: 1,276, Visits: 1,135

Here's a short CTE, assuming a table named #Employees with the data shown in the article:

WITH OrgChart (UserID, LastName, FirstName, ManagerUserID)
AS
(
    SELECT UserID, LastName, FirstName, ManagerUserID
    FROM #Employees
    WHERE ManagerUserID = 0
    UNION ALL
    SELECT e.UserID, e.LastName, e.FirstName, e.ManagerUserID
    FROM #Employees e
    INNER JOIN OrgChart o
    ON e.ManagerUserID = o.UserID
)
SELECT UserID, LastName, FirstName, ManagerUserID
FROM OrgChart;

Please excuse any typos - I had to retype this in here.

Post #371094
Posted Monday, June 4, 2007 12:25 PM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Thursday, December 18, 2014 6:32 AM
Points: 1,890, Visits: 3,472
"I was hoping it would appear in SS2005..."

SQL Server 2005 introduced Common Table Expressions (CTE) which can be recursive.
Post #371103
Posted Friday, May 16, 2008 8:13 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Monday, September 8, 2014 8:18 AM
Points: 26, Visits: 265
I like Craig's original approach because performance is good and it is straightforward. Another important consideration is many, if not most organizations that I am aware of tend to follow the 'one-employee-one-manager' model. I utilize essentially the same algorithm with three differences: 1) I encapsulate all of the code within the stored procedure, which simply outputs an ordered four-column dataset (SortName [Last, First], Employee ID, SupervisorID, Level), 2) I include the Level column (integer), which indicates how many levels down from the specified manager each employee is in the hierarchy, and 3) I have an additional input parameter for Level, which allows me to easily return only the direct reports for the manager (or some other number of levels deep that I may be interested in at the moment). Of course, my method makes certain assumptions about how the names will be used since I am controlling the name format (we do a lot of dropdown lists and reports using the output.)
Post #502094
Posted Friday, May 16, 2008 12:13 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Tuesday, May 4, 2010 11:33 AM
Points: 43, Visits: 102
Being familiar with both the nested set model [specifically Celko's push stack] and the CTE model, they each have their advantages and are relatively easy to set up. I recommend the adjacency list for OLTP systems with a sproc to generate the nested set as hierarchical data is needed (it's a cheap operation).

Another option that I feel should at least be mentioned is denormalizing your data, although it limits the number of levels you can have in your hierarchy. Denormalization, however, would be the fastest hybrid OLTP/OLAP solution (nested sets IMO are functional OLAP).

- James
Post #502279
Posted Sunday, August 8, 2010 8:24 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 12:22 AM
Points: 35,772, Visits: 32,443
James Raddock (5/16/2008)
...with a sproc to generate the nested set as hierarchical data is needed (it's a cheap operation).


I know this is an old thread, but can you post the code for that sproc? Thanks.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #965710
« Prev Topic | Next Topic »

Add to briefcase ««12

Permissions Expand / Collapse