HierarchyID performance problems... Really???

Question

HierarchyID performance problems... Really???

Jeff Moden

SSC Guru

Points: 1003867
More actions
October 2, 2011 at 11:09 am

#259406

I heard tales of performance problems associated with the HierarchyID datatype in SQL Server 2008. Since I haven't actually used the HierarchyID datatype (and, therefore, haven't used any of it methods up 'till now), I built myself a nice "little" million row "clean" table in the form of an Adjacency List (including an extra column called "Sales") and converted it to use the HierarchyID data-type. I used the indexes that Microsoft Suggested for such a thing.
Then, I built several queries just to try my hand at it (tried them on a smaller 14 node Hierarchy first, just to make sure things worked correctly). All of these queries are based on what I've been able to Google as "hierarchyid performance problems".
1. Find all Descendants (all employees in the "down-line" tree) for a given node.
2. Find all Ancestors (all managers in the "up-line" chain) for a given node.
3. Find all Siblings (all nodes at the same level) for a given node.
4. Find the SUM of sales for all Descendants 7 levels "down" for each of the million nodes in the entire hierarchy and insert into a new table.
Although I'm sure I could tweak a couple of things here and there insofar as indexing goes, I'm just not seeing what I would call either a performance problem or a resource usage problem.
So, my questions are... has anyone actually experienced a performance problem using the HierarchyID data-type and associated methods? If so, could you describe what you were doing and, perhaps, even post some code that demonstrates the problem?
Thanks for the help, folks.
--Jeff Moden
RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
Change is inevitable... Change for the better is not.
Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 31 total)

You must be logged in to reply to this topic. Login to reply

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 1

Jeff Moden (10/2/2011)
I.... Since I haven't actually used the HierarchyID datatype (and, therefore, haven't used any of it methods up 'till now), I built myself a nice "little" million row "clean" table in the form of an Adjacency List (including an extra column called "Sales") and converted it to use the HierarchyID data-type. I used the indexes that Microsoft Suggested for such a thing.

Hi Jeff

I have not used this data type before...but am keen to learn more.

would you care to share your "little" million row test rig please?

regards jls

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

Jeff Moden SSC Guru Points: 1003867 More actions · Answer 2

Sure... the following code creates a million row "clean" hierarchy in the form of a classic Adjacency List. I have it as a stored procedure so that I can just call it with the number of rows I want the hierarchy to contain for different experiments...

CREATE PROCEDURE dbo.BuildTestHierarchy

/**********************************************************************************************************************

Create a randomized "clean" hierarchy. Each EmployeeID (except the first one, of course) is assigned a random

ManagerID number which is always less than the current EmployeeID. This code runs nasty fast and is great for

testing hierarchical processing code.

Usage: (both examples build a million row Adjacency List Hierarchy)

EXEC dbo.BuildTestHierarchy 1000000

Revision History:

Rev 00 - 28 Apr 2010 - Jeff Moden - Initial creation and test.

Rev 01 - 15 May 2010 - Jeff Moden - Abort if current DB isn't "tempdb" to protect users that want to "play".

**********************************************************************************************************************/

--===== Declare the I/O parameters

@pRowsToBuild INT

AS

--===== Make sure that we're in a safe place to run this...

IF DB_NAME() <> N'tempdb'

BEGIN

RAISERROR('Current DB is NOT tempdb. Run aborted.',11,1);

RETURN;

END

;

--===== Conditionaly drop the test table so we can do reruns more easily

IF OBJECT_ID('TempDB.dbo.Employee','U') IS NOT NULL

DROP TABLE TempDB.dbo.Employee

;

--===== Build the test table and populate it on the fly.

-- Everything except ManagerID is populated here.

SELECT TOP (@pRowsToBuild)

ISNULL(ROW_NUMBER() OVER (ORDER BY (SELECT 1)),0) AS EmployeeID,

CAST(0 AS INT) AS ManagerID,

CAST(NEWID() AS VARCHAR(36)) AS EmployeeName,

(ABS(CHECKSUM(NEWID()))%12+1)*1000 AS Sales

INTO TempDB.dbo.Employee

FROM Master.sys.All_Columns ac1

CROSS JOIN Master.sys.All_Columns ac2

;

--===== Update the test table with ManagerID's. The ManagerID is some random value which is always less than the

-- current EmployeeID to keep the hierarchy "clean" and free from "loop backs".

UPDATE TempDB.dbo.Employee

SET ManagerID = CASE

WHEN EmployeeID > 1

THEN ABS(CHECKSUM(NEWID())) % (EmployeeID-1) +1

ELSE NULL

END

;

--===== Add some indexes that most folks would like have on such a table

ALTER TABLE TempDB.dbo.Employee ADD CONSTRAINT PK_Employee PRIMARY KEY CLUSTERED (EmployeeID);

CREATE INDEX IX_Employee_ManagerID ON TempDB.dbo.Employee (ManagerID);

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

matzer SSCrazy Eights Points: 9727 More actions · Answer 3

You posted a possible answer to your own question more than a year ago, so maybe you forgot.

http://www.sqlservercentral.com/Forums/FindPost922664.aspx

Unfortunately, the link to connect.microsoft.com results in an unexpected system error now. Maybe you remember what the item was about.

Jeff Moden SSC Guru Points: 1003867 More actions · Answer 4

Thanks, Peter. Ironically, that was one of the "dead ends" I ran across in my search for performance problems and I'm definitely not seeing those types of problems in the testing I've set up, so far.

No... I don't remember what that particular Connect Item was about, either. MS is correct about CLRs not being "seen" by the Optimizer and that the problem they mentioned could definitely be a problem with any CLR, but I'm just not seeing such problems with HierarchyID code since I started playing with it yesterday. Maybe they fixed it so it wouldn't be such a problem anymore.

Anyway, if anyone has an example of were the use of the HierarchyID data-type and related methods is a performance problem, I'd sure like to see it.

Thanks again.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1003867 More actions · Answer 5

Gosh folks... based on the low number of responses on this thead, either not a whole lot of people are using the HierarchyID or not a whole lot of people are having problems with its performance.

Maybe I should change the question a bit... are any of you using the HierarchyID data type?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 6

Jeff Moden (10/3/2011)
Maybe I should change the question a bit... are any of you using the HierarchyID data type?

No.

maybe useful in the future when dealing with 3rd pty dbs.

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

Jeff Moden SSC Guru Points: 1003867 More actions · Answer 7

Heh... I figured that out from your post where you said you never used it before. 🙂 I was hoping some of the other denizens of SSC would jump in. Folks are normally pretty talkative on SSC and I'm amazed at the virtual silence this subject has drawn.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Sean Lange SSC Guru Points: 286573 More actions · Answer 8

I think that may be further proof that few people has utilized this...nobody want to speak on a topic that they don't know about. I am hoping I can find some time soon to look through the test you have so far. I know I am going to have a project in the next few months where this could be really useful. In my case I would need more of a forest (multiple roots) but I may be able to use the hierarchyID as part of the solution.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

J Livingston SQL SSC Guru Points: 51550 More actions · Answer 9

Jeff Moden (10/4/2011)
Heh... I figured that out from your post where you said you never used it before. 🙂 I was hoping some of the other denizens of SSC would jump in. Folks are normally pretty talkative on SSC and I'm amazed at the virtual silence this subject has drawn.

having read up a bit more on the subject...I can see possible benefits for such things as BOMs.

Would be interested to see if anyone out there is using it for BOMs.

At the moment I have no requirement for BOMS or complex Org charts ....so for now, its interesting but just not applicable (I think !!)

________________________________________________________________
you can lead a user to data....but you cannot make them think
and remember....every day is a school day

CGSJohnson SSCertifiable Points: 7208 More actions · Answer 10

We are now just evaluating its use in our shop. We would not have big data to work with though.

Thanks...Chris

Jeff Moden SSC Guru Points: 1003867 More actions · Answer 11

Sean Lange (10/4/2011)
I think that may be further proof that few people has utilized this...nobody want to speak on a topic that they don't know about. I am hoping I can find some time soon to look through the test you have so far. I know I am going to have a project in the next few months where this could be really useful. In my case I would need more of a forest (multiple roots) but I may be able to use the hierarchyID as part of the solution.

That's exactly what I'm thinking, Sean.

Shifting gears, I've done some rather extensive work with "forests of trees" using Nested Sets. I can't say what we're using it for because it's a proprietary process that might be patentable. If you can say, what would you be using it for? I'm asking because I might be able to help with some of the information I've developed for the article I'm slowly putting together on the subject of Hierarchical queries.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1003867 More actions · Answer 12

CGSJohnson (10/4/2011)
We are now just evaluating its use in our shop. We would not have big data to work with though.
Thanks...Chris

Thanks for the feedback, Chris. I realize it's a bit off subject for this thread but what would you end up using hierarchical structures for?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

CGSJohnson SSCertifiable Points: 7208 More actions · Answer 13

Hi, Jeff. Sorry, but I cannot talk about it all. However, one use would be for the common goal of organizational structure.

Thanks...Chris

Sean Lange SSC Guru Points: 286573 More actions · Answer 14

Jeff Moden (10/4/2011)
Sean Lange (10/4/2011)
I think that may be further proof that few people has utilized this...nobody want to speak on a topic that they don't know about. I am hoping I can find some time soon to look through the test you have so far. I know I am going to have a project in the next few months where this could be really useful. In my case I would need more of a forest (multiple roots) but I may be able to use the hierarchyID as part of the solution.
That's exactly what I'm thinking, Sean.
Shifting gears, I've done some rather extensive work with "forests of trees" using Nested Sets. I can't say what we're using it for because it's a proprietary process that might be patentable. If you can say, what would you be using it for? I'm asking because I might be able to help with some of the information I've developed for the article I'm slowly putting together on the subject of Hierarchical queries.

I have kind of wondered if the nested set would be easier/better for my situation. I am not doing it for anything proprietary by any stretch of the imagination. I am going to be completely rebuilding our menu system for one of our websites. The original knuckleheads that "designed" should be shot. I am thinking that the forest of trees would be perfect for this. It will not likely get very "deep" but the second and third levels can get fairly large as this is a ecomm site for about a half million skus. I can imagine how cool it would be if I could move whole "branches" from one tree to another, or at least copy them from one to the other.

I could of course do this pretty simple with an adjaceny list but what fun is recreating a technique that is already way out of date when there are better ways of accomplishing this?

I would be thrilled to help in anyway that I can.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/