As a follow-up to my last post on grabbing node names and values, I was asked about grabbing URIs from more complex XML documents. The following query is the result of that discussion:
DECLARE @x xml;
SET @x = N'<?xml version = "1.0"?> <root xmlns:n = "uri:components:name" xmlns:p = "uri:components:people"> <p:person> <name> <n:first-name>Joe</n:first-name> <n:last-name>Johnson</n:last-name> </name> <name> <n:first-name>Lennox</n:first-name> <n:last-name>Lewis</n:last-name> </name> <name> <n:first-name>Louie</n:first-name> <n:last-name>McDougal</n:last-name> </name> </p:person> </root>';SELECT CASE NodeUri WHEN '' THEN '' ELSE '{' + NodeUri + '}' END + NodeName AS [Name], NodeUri, NodeName, NodeValueFROM( SELECT node.value('fn:namespace-uri(.[1])', 'varchar(1000)') AS NodeUri, node.value('fn:local-name(.[1])', 'varchar(1000)') AS NodeName, node.query('./text()') AS NodeValue FROM @x.nodes(N'//*') T(node)) sub;
The result looks like this:
The first column, Name, represents the fully-expanded URI + local name in the {URI}local-name format. This is what XQuery processors, like the one in SQL Server, see once the namespace prefixes are expanded and prepended to local names. Notice that nodes with no namespace don't have anything prepended. The NodeURI column is generated with a call to the fn:namespace-uri() function, and it represents the URI portion of the namespace. The NodeName and NodeValue columns contain the local name of each node and the value, respectively, just like in the last post.
XQuery supports several functions and operators like these specifically for nodes, and SQL Server supports most of them.
I got a question the other day about grabbing node names and values from an XML document. The person asking was using string parsing with the SUBSTRING function to split out names from content based on '>' characters in the XML string. Not a fullproof solution, to be sure, but quick and easy based on his predictable XML data. So the question then becomes is there an XML-based method to grab this information?
DECLARE @x xml;SET @x = N'<?xml version = "1.0"?><root> <person> <name> <first-name>Joe</first-name> <last-name>Johnson</last-name> </name> <name> <first-name>Lennox</first-name> <last-name>Lewis</last-name> </name> <name> <first-name>Louie</first-name> <last-name>McDougal</last-name> </name> </person></root>';SELECT node.query('fn:local-name(.)') AS NodeName, node.query('./text()') AS NodeValueFROM @x.nodes(N'//*') T(node);
The fn:local-name() function retrieves the local name portion of XML node names, and the text() node test retrieves the node text, as shown below.
In a previous post I talked about performing "inner joins" in XQuery. The basic idea is that the inner join is simply a special case of a Cartesian product, or "cross join". In this post I'm going to briefly look at another common SQL-style join condition that can be duplicated to some degree in XQuery. Specifically I'm going to give an example of a retrieving nodes from a tuple stream that don't match the nodes in a second tuple stream.
In this post we'll duplicate about 50% of the SQL "left outer join" functionality. For those who aren't familiar with SQL left outer joins, they can be thought of as an inner join between two tables unioned together with the rows of the left-hand table that have no corresponding rows in the right-hand table.
For this example I'm going to borrow some slightly modified XML data from the previous post:
DECLARE @xml xml;-- Create sample XML documentSET @xml = N'<authors> <author id = "1">Fabio Claudio Ferracchiati</author> <author id = "2">Hugo Kornelis</author> <author id = "3">Rob Walters</author> <author id = "4">Lara Rubbelke</author> <author id = "5">Adam Machanic</author> <author id = "6">Michael Coles</author> <author id = "7">Robin Dewson</author> <author id = "8">Jan D. Narkiewicz</author> <author id = "9">Robert Rae</author></authors><books> <book title = "Pro T-SQL 2005 Programmer's Guide"> <isbn>159059794X</isbn> <author>6</author> </book> <book title = "Accelerated SQL Server 2008"> <isbn>1590599691</isbn> <author>3</author> <author>6</author> <author>7</author> <author>1</author> <author>8</author> <author>9</author> </book> <book title = "Pro SQL Server 2008 XML"> <isbn>1590599837</isbn> <author>6</author> </book></books>';-- Perform outer joinSELECT @xml.query('for $author in /authors/author where fn:empty($author[@id = /books/book/author]) return <author> { $author } </author>');
This sample is available as a download file here.
The XML data consists of a list of authors and a list of books. The query uses a FLWOR expression to retrieve all authors who don't have a corresponding book node. Here's how it works:
The results are shown below:
<author> <author id="2">Hugo Kornelis</author></author><author> <author id="4">Lara Rubbelke</author></author><author> <author id="5">Adam Machanic</author></author>
As you can see, the three authors with id's of (2, 4, 5) were returned because they have no corresponding book nodes in the XML data. This technique, combined with the previously discussed inner join technique, can be used to simulate SQL style outer joins in XQuery.
Part of the promise of XML is that it makes modeling hierarchical data relatively painless. The hierarchical structure of XML automatically provides context to your data. Consider the following simple example:
<General name="Grant"> <Colonel name="Sanders"> <Major name="Payne"> <Captain name="Calamity"> <Sergeant name="Bilko"> <Private name="Benjamin" /> </Sergeant> </Captain> </Major> </Colonel></General>
As you can see, the hierarchical structure is implicit in the structure of the XML. The General is above the Colonel is above the Major, and so on. Organizational charts, military chains of command, and other similar structures are standard fare for hierarchical modeling. Needless to say, the AdventureWorks database has a standard organizational hierarchy modeled as relational data which you'll probably see used in a whole lot of samples all over the place.
But AdventureWorks also includes a more detailed, but often-overlooked, real-world example of hierarchy. The AdventureWorks Bill of Materials (BOM) hierarchy is modeled as an adjacency list hierarchy; one in which the child nodes maintain links to their parent nodes in the same table. You can query the Bill of Materials using the dbo.uspGetBillOfMaterials procedure like this:
EXEC dbo.uspGetBillOfMaterials 749, '2001-01-01'
The results are a "flat" relational listing of the raw materials used to build the "Road-150 Red, 62" bike.
The results do provide the parent and child component ID numbers, as well as the level of the component in the BOM. But they do not convey the hierarchical information without some sort of reconstruction on your part - probably performed in the UI for display purposes. All of the information required to reconstruct the hierarchical structure of the BOM is available in this result set, but it's not readily apparent. You can use FOR XML PATH and the xml data type to reconstruct the hierarchical structure of the BOM with a simple recursive function like this:
CREATE FUNCTION dbo.GetRecursiveBom(@ProductID int, @level int = 0)RETURNS xmlASBEGIN DECLARE @x xml; SET @x = ( SELECT bom.ComponentID AS "@id", p.ProductNumber AS "@number", @level AS "@level", p.Name AS "name", p.Color AS "color", p.ListPrice AS "list-price", bom.PerAssemblyQty AS "quantity", p.Size AS "size", p.SizeUnitMeasureCode AS "unit-of-measure", ( SELECT CASE WHEN @level < 5 THEN dbo.GetRecursiveBom(bom.ComponentID, @level + 1) ELSE '' END ) FROM Production.BillOfMaterials bom INNER JOIN Production.Product p ON bom.ComponentID = p.ProductID WHERE @ProductID = CASE WHEN @level = 1 THEN bom.ComponentId ELSE bom.ProductAssemblyID END FOR XML PATH (N'item'), TYPE ); RETURN @x;ENDGO
This function calls itself recursively to retrieve a BOM (up to 5 levels deep) in XML format. You can call the function like this:
SELECT dbo.GetRecursiveBom (749, 0);
And the resulting XML looks like the image below:
As you can see the item nodes are nested within each other, with each component acting as a container for its constitutent components. The code for this function is available as a download file here: GetRecursiveBom.sql.txt
In a future post we'll look at how to create a "recursive" XML schema to validate the XML produced by this function.