July 14, 2006 at 11:28 am
Imagine we have a table called PathMetaData that describes what's in certain directories (i.e., this directory belongs to Mark and is regarding Project Acme). We also have a table called Files that lists all the files and subdirectories in these directories.
PathMetaData can contain information that might "overlap" multiple directories. For example, we might have a couple entries like:
In the Files table we might have:
I would like to write a query that joins the Files table with PathMetaData table *at the deepest level*. So even though \\One\Two and \\One\Two\Buckle in PathMetaData might both match \\One\Two\Buckle\document.doc, I want only the join resulting from \\One\Two\Buckle.
These tables are based on some ideas for manipulating hierarchies in O'Reilly's "Transact-SQL Cookbook", so in both PathMetaData and Files the path is actually the clustered index, so I want to do these joins based on LIKE (especially because both of these tables will contain potentially millions of rows). Here are some simplified versions of the tables and data:
PathID Int Identity,
Path VarChar(128),
Author VarChar(64),
Project VarChar(64))
DECLARE @Files TABLE (
FileID Int Identity,
FilePath VarChar(128))
INSERT INTO @PathMetaData Values('\\One\Two\', '0100', '001')
INSERT INTO @PathMetaData Values('\\One\Two\Buckle\', '0100', '002')
INSERT INTO @PathMetaData Values('\\One\Two\Buckle\My\Shoe\', '0100', '004')
INSERT INTO @PathMetaData Values('\\Three\Four\', '0200', '001')
INSERT INTO @PathMetaData Values('\\Three\Four\Close\The\Door\', '0300', '005')
INSERT INTO @Files VALUES('\\One\Two\document.doc')
INSERT INTO @Files VALUES('\\One\Two\Three\document.doc')
INSERT INTO @Files VALUES('\\One\Two\Buckle\document.doc')
INSERT INTO @Files VALUES('\\One\Two\Buckle\My\document.doc')
INSERT INTO @Files VALUES('\\Three\Four\document.doc')
INSERT INTO @Files VALUES('\\Three\Four\Close\document.doc')
INSERT INTO @Files VALUES('\\Three\Four\Close\The\Door\document.doc')
Here's one query that I've been using and it gives correct results but surely there's a better way to write this:
SELECT maxdepth.FilePath, pmd.* FROM
(
SELECT fi.FilePath, MAX(Len(pmd.Path)) MaxLen
FROM @PathMetaData pmd JOIN @Files fi
ON fi.FilePath LIKE pmd.Path + '%'
GROUP BY fi.FilePath) MaxDepth JOIN @PathMetaData pmd
ON MaxDepth.FilePath LIKE pmd.Path + '%' AND MaxDepth.MaxLen = Len(pmd.Path)
ORDER BY maxdepth.FilePath
Any thoughts on this would be greatly appreciated.
Thanks
Mark
July 14, 2006 at 12:13 pm
SELECT fi.FilePath, MAX(pmd.Path) [Path]
@PathMetaData pmd
[edit: the second one requires an exact match for the longest subpath]
Tim Wilkinson
"If it doesn't work in practice, you're using the wrong theory"
- Immanuel Kant
Viewing 2 posts - 1 through 1 (of 1 total)
You must be logged in to reply to this topic. Login to reply