Find hierarchyID from different parts of a node?

Question

Find hierarchyID from different parts of a node?

Y.B.

SSChampion

Points: 11652
More actions
March 14, 2016 at 12:24 pm

#399858

I started working with the hierarchyID data type. So far it's been pretty great working with the built-in functions rather than having to use recursive CTE's.
Consider the following:
DECLARE @myTable TABLE (ID hierarchyID, Name VARCHAR(50))
INSERT INTO @myTable
VALUES ('/1/', 'Publication1'), ('/1/6/', 'Technology'), ('/1/6/22/', 'Internet'), ('/1/6/1256/', 'Gaming'), ('/1/67/', 'Sports'), ('/1/67/1134/', 'Football'), ('/1/67/1135/', 'Local'), ('/1/103/', 'News'), ('/1/103/10657/', 'Local')
SELECT * FROM @myTable
--No problem when the Name is unique
SELECT ID, ID.ToString() AS Path FROM @myTable WHERE Name = 'Football' AND ID.IsDescendantOf('/1/') = 1
--Problem when the Name is used more than once down a path
SELECT ID, ID.ToString() AS Path FROM @myTable WHERE Name = 'Local' AND ID.IsDescendantOf('/1/') = 1
--Can do it this way but is there a better way than using LIKE?
SELECT ID, ID.ToString() AS Path FROM @myTable WHERE ID.ToString() LIKE '%1135%' AND ID.IsDescendantOf('/1/') = 1
The data is imported in such a way that level 1 is basically different publications. Any node 'id' below that are completely unique even though they can be duplicated in separated publications. My problem is how do I find the hierarchyID based on descendants of a level 1 node that contains that node 'id' without using 'LIKE'. Maybe I'm just being OCD but I hate using LIKE with a wildcard at the beginning which will essentially make it non sargable.
Any thoughts?
SELECT quote FROM brain WHERE original = 1
0 rows returned

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Phil Parkin SSC Guru Points: 247228 More actions · Answer 1

yb751 (3/14/2016)
I started working with the hierarchyID data type. So far it's been pretty great working with the built-in functions rather than having to use recursive CTE's.
Consider the following:
DECLARE @myTable TABLE (ID hierarchyID, Name VARCHAR(50))
INSERT INTO @myTable
VALUES ('/1/', 'Publication1'), ('/1/6/', 'Technology'), ('/1/6/22/', 'Internet'), ('/1/6/1256/', 'Gaming'), ('/1/67/', 'Sports'), ('/1/67/1134/', 'Football'), ('/1/67/1135/', 'Local'), ('/1/103/', 'News'), ('/1/103/10657/', 'Local')
SELECT * FROM @myTable
--No problem when the Name is unique
SELECT ID, ID.ToString() AS Path FROM @myTable WHERE Name = 'Football' AND ID.IsDescendantOf('/1/') = 1
--Problem when the Name is used more than once down a path
SELECT ID, ID.ToString() AS Path FROM @myTable WHERE Name = 'Local' AND ID.IsDescendantOf('/1/') = 1
--Can do it this way but is there a better way than using LIKE?
SELECT ID, ID.ToString() AS Path FROM @myTable WHERE ID.ToString() LIKE '%1135%' AND ID.IsDescendantOf('/1/') = 1
The data is imported in such a way that level 1 is basically different publications. Any node 'id' below that are completely unique even though they can be duplicated in separated publications. My problem is how do I find the hierarchyID based on descendants of a level 1 node that contains that node 'id' without using 'LIKE'. Maybe I'm just being OCD but I hate using LIKE with a wildcard at the beginning which will essentially make it non sargable.
Any thoughts?

Have you considered using the infamous DelimitedSplit8k and splitting on '/'?

Y.B. SSChampion Points: 11652 More actions · Answer 2

Phil Parkin (3/14/2016)
yb751 (3/14/2016)
I started working with the hierarchyID data type. So far it's been pretty great working with the built-in functions rather than having to use recursive CTE's.
Consider the following:
DECLARE @myTable TABLE (ID hierarchyID, Name VARCHAR(50))
INSERT INTO @myTable
VALUES ('/1/', 'Publication1'), ('/1/6/', 'Technology'), ('/1/6/22/', 'Internet'), ('/1/6/1256/', 'Gaming'), ('/1/67/', 'Sports'), ('/1/67/1134/', 'Football'), ('/1/67/1135/', 'Local'), ('/1/103/', 'News'), ('/1/103/10657/', 'Local')
SELECT * FROM @myTable
--No problem when the Name is unique
SELECT ID, ID.ToString() AS Path FROM @myTable WHERE Name = 'Football' AND ID.IsDescendantOf('/1/') = 1
--Problem when the Name is used more than once down a path
SELECT ID, ID.ToString() AS Path FROM @myTable WHERE Name = 'Local' AND ID.IsDescendantOf('/1/') = 1
--Can do it this way but is there a better way than using LIKE?
SELECT ID, ID.ToString() AS Path FROM @myTable WHERE ID.ToString() LIKE '%1135%' AND ID.IsDescendantOf('/1/') = 1
The data is imported in such a way that level 1 is basically different publications. Any node 'id' below that are completely unique even though they can be duplicated in separated publications. My problem is how do I find the hierarchyID based on descendants of a level 1 node that contains that node 'id' without using 'LIKE'. Maybe I'm just being OCD but I hate using LIKE with a wildcard at the beginning which will essentially make it non sargable.
Any thoughts?
Have you considered using the infamous DelimitedSplit8k and splitting on '/'?

OMG, it seems so obvious...lol :pinch:

Thanks Phil, I'll give that a go.

SELECT quote FROM brain WHERE original = 1
0 rows returned

Jeff Moden SSC Guru Points: 1004706 More actions · Answer 3

Shifting gears a bit...

...Consider the possibility of not using HierarchyID at all.

http://www.sqlservercentral.com/articles/Hierarchy/94040/

http://www.sqlservercentral.com/articles/T-SQL/94570/

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Y.B. SSChampion Points: 11652 More actions · Answer 4

Jeff Moden (3/14/2016)
Shifting gears a bit...
...Consider the possibility of not using HierarchyID at all.
http://www.sqlservercentral.com/articles/Hierarchy/94040/
http://www.sqlservercentral.com/articles/T-SQL/94570/

Interesting stuff Jeff...I've bookmarked those articles. A little more than I can handle before I've finished my morning coffee but I'll definitely get back to it. Why did I choose to use hierarchyid's? Well since we finally put 2005 in our rear view mirror I've been finally able to leverage some of the newer features and the data type just seemed to make sense to me from a design perspective. It didn't hurt that it was very easy to setup and understand. It has the added benefit of shrinking down the table to just two columns. The main use for this table is mostly to query all parents of a node. What can be simpler than?

DECLARE @sample hierarchyid = '/1/6/22/'

SELECT Name FROM @myTable WHERE @sample.IsDescendantOf(ID) = 1

This is not to discount any other methods; I'm just mentioning why I chose to go with that data type in the first place.

SELECT quote FROM brain WHERE original = 1
0 rows returned

Jeff Moden SSC Guru Points: 1004706 More actions · Answer 5

yb751 (3/15/2016)
Jeff Moden (3/14/2016)
Shifting gears a bit...
...Consider the possibility of not using HierarchyID at all.
http://www.sqlservercentral.com/articles/Hierarchy/94040/
http://www.sqlservercentral.com/articles/T-SQL/94570/
Interesting stuff Jeff...I've bookmarked those articles. A little more than I can handle before I've finished my morning coffee but I'll definitely get back to it. Why did I choose to use hierarchyid's? Well since we finally put 2005 in our rear view mirror I've been finally able to leverage some of the newer features and the data type just seemed to make sense to me from a design perspective. It didn't hurt that it was very easy to setup and understand. It has the added benefit of shrinking down the table to just two columns. The main use for this table is mostly to query all parents of a node. What can be simpler than?
DECLARE @sample hierarchyid = '/1/6/22/'
SELECT Name FROM @myTable WHERE @sample.IsDescendantOf(ID) = 1
This is not to discount any other methods; I'm just mentioning why I chose to go with that data type in the first place.

Understood. It just (according to many posts and some articles on the Internet) tends to be slow and, if something ever goes haywire with the structure, it will be difficult for a human to find and fix because each node is "aware" of many other nodes. Keeping it simple in the form of an "Adjacency List" will allow humans to much more easily fix the eventual problems with any hierarchical structure. The "hierarchical path/sort" provided as an interim step in the article will allow you to easily solve problems like this one you've posted using a simple SUBSTRING to get the data you want, and the creation of the "Nested Sets" will make for queries that will blow the doors off of most anything else. It will also "naturally" produce reports in the correct sorted order.

Of course, if you don't currently and won't ever have such needs (I'm thinking you might, though, based on your current question on this thread), then HierarchyID will do the thing you need for now.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Y.B. SSChampion Points: 11652 More actions · Answer 6

Jeff Moden (3/15/2016)
yb751 (3/15/2016)
Jeff Moden (3/14/2016)
Shifting gears a bit...
...Consider the possibility of not using HierarchyID at all.
http://www.sqlservercentral.com/articles/Hierarchy/94040/
http://www.sqlservercentral.com/articles/T-SQL/94570/
Interesting stuff Jeff...I've bookmarked those articles. A little more than I can handle before I've finished my morning coffee but I'll definitely get back to it. Why did I choose to use hierarchyid's? Well since we finally put 2005 in our rear view mirror I've been finally able to leverage some of the newer features and the data type just seemed to make sense to me from a design perspective. It didn't hurt that it was very easy to setup and understand. It has the added benefit of shrinking down the table to just two columns. The main use for this table is mostly to query all parents of a node. What can be simpler than?
DECLARE @sample hierarchyid = '/1/6/22/'
SELECT Name FROM @myTable WHERE @sample.IsDescendantOf(ID) = 1
This is not to discount any other methods; I'm just mentioning why I chose to go with that data type in the first place.
Understood. It just (according to many posts and some articles on the Internet) tends to be slow and, if something ever goes haywire with the structure, it will be difficult for a human to find and fix because each node is "aware" of many other nodes. Keeping it simple in the form of an "Adjacency List" will allow humans to much more easily fix the eventual problems with any hierarchical structure. The "hierarchical path/sort" provided as an interim step in the article will allow you to easily solve problems like this one you've posted using a simple SUBSTRING to get the data you want, and the creation of the "Nested Sets" will make for queries that will blow the doors off of most anything else. It will also "naturally" produce reports in the correct sorted order.
Of course, if you don't currently and won't ever have such needs (I'm thinking you might, though, based on your current question on this thread), then HierarchyID will do the thing you need for now.

Well I haven't completely set my mind on hierarchyid's just yet. I still have to do some testing to make sure it meets our requirement. I'm actually really grateful you pointed out a different method. It's never a bad thing to have options.

SELECT quote FROM brain WHERE original = 1
0 rows returned

Y.B. SSChampion Points: 11652 More actions · Answer 7

Ok, Jeff...you're starting to make a believer out of me. Hierarchyid's are all nice and dandy until you go ahead and move a node. Sure there is a built-in function to help move a node (GetReparentedValue) but it leaves it up to you fix the rest of the hierarchy after you move it. Child nodes essentially become orphaned. What I like about your method is how child nodes are preserved in the adjacency list portion if I were to change a parent of a node. I'd have to rebuild the sort path every time a change is made but I only anticipate the table having a couple thousand entries at most and would change so infrequently. I'll test my data with your method and let you know how it goes.

SELECT quote FROM brain WHERE original = 1
0 rows returned

Y.B. SSChampion Points: 11652 More actions · Answer 8

Sigh, so close!

Tested, Jeff's technique and it works great. Had to make some modifications to the code to work with my data but it wasn't too bad. Was able to quickly and easily pull parents of a child or vise versa with no issues.

But...and you know there had to be one.

Using this method implies you have unique ID's and I do for the most part; for each publication. Using hierarchyid this was easy as I could create a level just below the root to keep all of the publications unique and then populate the rest of the hierarchy below those nodes. This is because hierarchyid allows you to use the same 'id' multiple times as long as the path is unique.

I'm not completely giving up on Jeff's method just yet. I'll see if I can change the code that rebuilds the nested sets so that it can do it multiple times per publication based on a separate publication ID.

SELECT quote FROM brain WHERE original = 1
0 rows returned

Jeff Moden SSC Guru Points: 1004706 More actions · Answer 9

yb751 (3/17/2016)
Ok, Jeff...you're starting to make a believer out of me. Hierarchyid's are all nice and dandy until you go ahead and move a node. Sure there is a built-in function to help move a node (GetReparentedValue) but it leaves it up to you fix the rest of the hierarchy after you move it. Child nodes essentially become orphaned. What I like about your method is how child nodes are preserved in the adjacency list portion if I were to change a parent of a node. I'd have to rebuild the sort path every time a change is made but I only anticipate the table having a couple thousand entries at most and would change so infrequently. I'll test my data with your method and let you know how it goes.

And, the method in the first article will whip through a million node hierarchy in just 54 seconds not only creating the SortPath but also creating the nested sets, which makes both downline and upline queries a high performance breeze.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004706 More actions · Answer 10

yb751 (3/17/2016)
Sigh, so close!
Tested, Jeff's technique and it works great. Had to make some modifications to the code to work with my data but it wasn't too bad. Was able to quickly and easily pull parents of a child or vise versa with no issues.
But...and you know there had to be one.
Using this method implies you have unique ID's and I do for the most part; for each publication. Using hierarchyid this was easy as I could create a level just below the root to keep all of the publications unique and then populate the rest of the hierarchy below those nodes. This is because hierarchyid allows you to use the same 'id' multiple times as long as the path is unique.
I'm not completely giving up on Jeff's method just yet. I'll see if I can change the code that rebuilds the nested sets so that it can do it multiple times per publication based on a separate publication ID.

When that type of problem happens, it's pretty easy to create a unique surrogate ID for each node using RowNum and then calculate the parent surrogate. Once done, just use the surrogate Parent/Child IDs instead of the duplicated natural IDs.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Y.B. SSChampion Points: 11652 More actions · Answer 11

Jeff Moden (3/17/2016)
yb751 (3/17/2016)
Sigh, so close!
Tested, Jeff's technique and it works great. Had to make some modifications to the code to work with my data but it wasn't too bad. Was able to quickly and easily pull parents of a child or vise versa with no issues.
But...and you know there had to be one.
Using this method implies you have unique ID's and I do for the most part; for each publication. Using hierarchyid this was easy as I could create a level just below the root to keep all of the publications unique and then populate the rest of the hierarchy below those nodes. This is because hierarchyid allows you to use the same 'id' multiple times as long as the path is unique.
I'm not completely giving up on Jeff's method just yet. I'll see if I can change the code that rebuilds the nested sets so that it can do it multiple times per publication based on a separate publication ID.
When that type of problem happens, it's pretty easy to create a unique surrogate ID for each node using RowNum and then calculate the parent surrogate. Once done, just use the surrogate Parent/Child IDs instead of the duplicated natural IDs.

Thanks for advice Jeff. In the end I used half of your solution, the sorted path to be specific. That is to say I didn't make use of the Nested Set. Using the bowers was a pretty cool and easy way to find parents/children but it added some other issues based on my requirements. The surrogate ID's could have helped with this but it was unnecessary. My final solution was to add a PubID parameter to the stored procedure and use that ID to only create/update the sort path for that publication. When querying the table the PubID is obviously required to pull the right categories.

I tested this using real data in a test environment and it is very fast. 😀

Cheers!

SELECT quote FROM brain WHERE original = 1
0 rows returned

Jeff Moden SSC Guru Points: 1004706 More actions · Answer 12

@yb751,

NP. Glad the SortPath helped! Thanks for the feedback.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)