April 16, 2011 at 11:36 am
Hi guys I have a quick question about using hierarchyid for storing url's in the database. Seems like the most optimal solution. So for example if url is /Content/News/October.... to find this in a hierarchy table and get the id.
Here is what I have so far:
CREATE TABLE [dbo].[Table_1](
[id] [hierarchyid] NOT NULL,
[urlName] [varchar](50) NOT NULL,
CONSTRAINT [PK_Table_1] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Now just inserting random stuff into it
INSERT INTO [Table_1] ([Id], [UrlName]) VALUES
(hierarchyid::GetRoot(), '/', 0),
('/1/', 'Content'),
('/1/1/', 'Articles'),
('/1/2/', 'News'),
('/1/2/3', 'October')
there is also a plain index on urlName column.
Here is what I'm trying to do. I get a url passed in and from there traverse the tree to get to find the node.
declare @url varchar(1000) = '/Content/News/October'
DECLARE @parentId hierarchyid
select @parentId = id from Table_1 where urlName = 'Content';
select * from Table_1
where id.IsDescendantOf(@parentId) = 1
However I get stuck right out here...not sure how to continue going through. Am I just better off writing parent/child int id columns and self join?
Thanks for any advice.
April 16, 2011 at 2:17 pm
Inherently, URL's are already a form of hierarchy "ID". Are you sure it's worth using HierarchyID for this?
--Jeff Moden
Change is inevitable... Change for the better is not.
April 16, 2011 at 9:14 pm
The reason I think it makes sense is because hierarchyid is stored in leafnodes so that when I have to get the id (which is what I'm after) by parsing the url it might be less work on the server by jumping through leaf nodes vs mugging entire table.
I've never implemented this type of ordeal, so open to suggestions. Just seems like hierarchyid is meant for stuff like this. But it might not be the best, I don't really know.
April 16, 2011 at 10:20 pm
If you take a look at the hierarchy ID column you posted, it's just numbers that represent the "layers" of the URL. Your problem is now how to put those layers back together to find a single leaf node and it's difficult because you have to first find each HierarchyID for each layer of the URL before you can look up the URL. It'll be a lot faster if you just store the URL layers. If you need to find all member of a given layer, then a simple Adjacency List (you called it a Parent Child list) converted to a HierarchyID would be better provided they lived side-by-side in the same table.
Of course, the Nirvana of such Hierarchical things would be to convert to a Nested Set.
If you could post your table definition and the data you're trying to turn into a hierarchy (see the first link in my signature line below for how to do that correctly for a forum), I might be able to better show you what I mean.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 16, 2011 at 10:59 pm
AssetLine.com is the only truly global online exchange where the risk is taken out of buying and selling used construction equipment over the internet.
April 17, 2011 at 6:20 pm
Thanks Jeff,
Here is what we have now:
Table and it's indexes
CREATE TABLE [dbo].[UrlContent](
[pageid] [int] NOT NULL,
CONSTRAINT [PK_UrlContent] PRIMARY KEY CLUSTERED
(
[pageid] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_PADDING OFF
GO
/****** Object: Index [IX_UrlContent] Script Date: 04/17/2011 17:06:05 ******/
CREATE UNIQUE NONCLUSTERED INDEX [IX_UrlContent] ON [dbo].[UrlContent]
(
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
sample data (just created a while loop to generate insert statement:
INSERT INTO [dbo].[UrlContent]
([pageid]
VALUES
(10
,'Content/News/October/SomeFunArticle'),
(11
,'Content/News/October/SomeArticleTitle/Page1'),
/* printing random data to build a simple insert statement to shrink it. But our table has thousands of rows. */
declare @counter int = 2
while (@counter < 500)
begin
print '(' + cast(@counter as varchar) + ', ''Content/News/October/SomeArticleTitle/Page' + cast(@counter as varchar) + '''),'
set @counter = @counter + 1
end
We are running into issues with this setup because there is a lot of duplicated content and it is also very difficult to find child nodes. For example all news for October. There isn't a hierarchy of one item being 'Content' other being 'News' with some parent/child relationship.
What would you advice?
Thank you very much,
Chris
April 17, 2011 at 7:19 pm
chrisLosAngles (4/17/2011)
What would you advice?
"Consistency and extensibility".
First, you're going to hate yourself in another year when another "October" comes around. You MUST include the year for your "volume" markers.
Second, the ID in the table isn't a "page_id". It's an ID for the row of the table and should be named after the table itself.
After that, everything is child's play and you don't necessarily need a HierarchyID to do what you need.
With all that in mind, please consider the following table setup, the test data for 3 different months of articles across 2 different years, and some example queries.
--DROP TABLE dbo.UrlContent
GO
CREATE TABLE dbo.UrlContent
(
UrlContentID INT IDENTITY(1,1) NOT NULL,
Url VARCHAR(500) NOT NULL,
CONSTRAINT PK_UrlContent PRIMARY KEY CLUSTERED (UrlContentID ASC)
)
;
CREATE UNIQUE NONCLUSTERED INDEX AK_UrlContent_Url ON dbo.UrlContent (Url ASC)
;
-----------------------------------------------------------------------------------------------------
INSERT INTO dbo.UrlContent
(Url)
SELECT 'Content/News/201010/SomeFunArticle/'
UNION ALL
SELECT 'Content/News/201109/SomeArticleTitle/'
UNION ALL
SELECT 'Content/News/201110/SomeOtherArticleTitle/'
UNION ALL
SELECT TOP 500
Url = 'Content/News/201110/SomeArticleTitle/Page'
+ RIGHT('0000'+CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS VARCHAR(10)),4)
FROM master.sys.all_columns ac1,
master.sys.all_columns ac2
UNION ALL
SELECT TOP 500
Url = 'Content/News/201109/SomeFunArticle/Page'
+ RIGHT('0000'+CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS VARCHAR(10)),4)
FROM master.sys.all_columns ac1,
master.sys.all_columns ac2
UNION ALL
SELECT TOP 500
Url = 'Content/News/201110/SomeOtherArticleTitle/Page'
+ RIGHT('0000'+CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS VARCHAR(10)),4)
FROM master.sys.all_columns ac1,
master.sys.all_columns ac2
;
-----------------------------------------------------------------------------------------------------
--===== Find just the article titles written in Oct 2011
SELECT *
FROM dbo.UrlContent
WHERE Url LIKE 'Content/News/201110/%/'
ORDER BY URL
;
--===== Find just a single article and its pages
SELECT *
FROM dbo.UrlContent
WHERE Url LIKE 'Content/News/201110/SomeOtherArticleTitle/%'
ORDER BY URL
;
--===== Find just the article titles written in Oct 2010
SELECT *
FROM dbo.UrlContent
WHERE Url LIKE 'Content/News/201010/%/'
ORDER BY URL
;
--===== Find all the articles written in 2011
SELECT *
FROM dbo.UrlContent
WHERE Url LIKE 'Content/News/2011__/%/'
ORDER BY URL
;
--===== Find the articles written in October of any year
SELECT *
FROM dbo.UrlContent
WHERE Url LIKE 'Content/News/____10/%/'
ORDER BY URL
;
Like I said, proper coded Url's make their own Hierarchy and can be easily queried.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 19, 2011 at 8:08 pm
Thanks, I think that will work for now.
April 20, 2011 at 6:51 am
You bet. Thank you for the feedback.
--Jeff Moden
Change is inevitable... Change for the better is not.
April 22, 2011 at 7:13 pm
I'm now running into another snag with this setup actually. I need to build a website hierarchy menu out of this (it doesn't have to be based on url's specifically, can be based off the pages themselves). But not sure how to set this up to be efficient.
We have the other table for storing other page info from which I'm trying to build the menu. (Url's are a separate table)
The column ParentpageId is new, and I can populate it as necessary for now. And I was thinking maybe that could be a candidate for hierarchyid or not.
CREATE TABLE [dbo].[Pages](
[PageId] [int] PRIMARY KEY IDENTITY(1,1) NOT NULL,
[PageTitle] [varchar](50) NOT NULL,
[ParentPageId] [int] NOT NULL,
[UrlId] [int] NOT NULL,
[PageOrder] [int] -- hierarchyid maybe?
)
Is there an efficient query that can be performed to build the menu? Anything you can advise?
Thank you very much,
Chris
April 23, 2011 at 11:20 am
Heh... Yep. In fact, that's such a common request that I wrote an "SQL Spackle" article on the subject. Please see the following link...
http://www.sqlservercentral.com/articles/T-SQL/72503/
--Jeff Moden
Change is inevitable... Change for the better is not.
April 25, 2011 at 10:46 am
Thanks again Jeff,
I just read the article. And it is just awesome! Exactly what I needed.
Thanks,
Chris
April 25, 2011 at 3:30 pm
Very cool. Glad I could help and thank you for the feedback!
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 13 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy