Stairway to XML

Stairway to XML: Level 5 - The XML exist() and nodes() Methods

,

In the last Level of this series, I introduced you to the query() and value() methods, which are available to the XML data type and can be used to query data from an XML instance. As you’ll recall, the query() method returns a subset of untyped XML from the target XML column (or other XML object), and the value() method returns a scalar value of a specified data type.

In this Level, I introduce you to two more XML methods: exist() and nodes(). Like the query() and value() methods, the exist() and nodes() methods let you query XML data by specifying an XQuery expression. However, the results returned by the methods are much different from query() and value(). The exist() method returns a BIT value, and the nodes() method returns a rowset view used to shred the XML instance. This will all become clearer as we work through the exercises.

Note

As mentioned in the last Level, XQuery is a complex language. We can touch upon only some of its elements in this Level. For a more thorough understanding of XQuery and how it’s implemented in SQL Server, see the MSDN XQuery language reference.

To demonstrate the exist() and nodes() methods, I used the same test environment I set up in the last Level. I created a database named ClientDB, an XML schema collection named ClientInfoCollection, and a table named ClientInfo (all created on a local instance of SQL Server 2008 R2), as shown in Listing 1.

USE master;
GO
IF DB_ID('ClientDB') IS NOT NULL
DROP DATABASE ClientDB;
GO
CREATE DATABASE ClientDB;
GO
USE ClientDB;
GO
IF OBJECT_ID('ClientInfoCollection') IS NOT NULL
DROP XML SCHEMA COLLECTION ClientInfoCollection;
GO
CREATE XML SCHEMA COLLECTION ClientInfoCollection AS 
'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
xmlns="urn:ClientInfoNamespace" 
targetNamespace="urn:ClientInfoNamespace" 
elementFormDefault="qualified">
  <xsd:element name="People">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Person" minOccurs="1" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="FirstName" type="xsd:string" minOccurs="1" maxOccurs="1" />
              <xsd:element name="LastName" type="xsd:string" minOccurs="1" maxOccurs="1" />
              <xsd:element name="FavoriteBook" type="xsd:string" minOccurs="0" maxOccurs="5" />
            </xsd:sequence>
            <xsd:attribute name="id" type="xsd:integer" use="required"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>';
GO
IF OBJECT_ID('ClientInfo') IS NOT NULL
DROP TABLE ClientInfo;
GO
CREATE TABLE ClientInfo
(
  ClientID INT PRIMARY KEY IDENTITY,
  Info_untyped XML,
  Info_typed XML(ClientInfoCollection)
);
INSERT INTO ClientInfo (Info_untyped, Info_typed)
VALUES
(
  '<?xml version="1.0" encoding="UTF-8"?>
  <People>
    <Person id="1234">
      <FirstName>John</FirstName>
      <LastName>Doe</LastName>
    </Person>
    <Person id="5678">
      <FirstName>Jane</FirstName>
      <LastName>Doe</LastName>
    </Person>
  </People>',
  '<?xml version="1.0" encoding="UTF-8"?>
  <People xmlns="urn:ClientInfoNamespace">
    <Person id="1234">
      <FirstName>John</FirstName>
      <LastName>Doe</LastName>
    </Person>
    <Person id="5678">
      <FirstName>Jane</FirstName>
      <LastName>Doe</LastName>
    </Person>
  </People>'
);

Listing 1: Setting up the test environment for the examples in this Level

Notice in Listing 1 that the ClientInfo table includes two XML columns, one untyped and one typed. In addition, the listing includes an INSERT statement that adds a row to the table. If you decide to try out the examples in this Level, you’ll need to run this code before proceeding.

The XML exist() Method

The exist() method tests for the existence of an element in the targeted XML instance. That element is specified by the XQuery expression passed into the method. As the following syntax shows, you enclose the expression in single quotes and parentheses:

db_object.exist('xquery_exp')

The db_object placeholder is the XML column, variable, or parameter that contains the targeted XML instance, and the xquery_exp placeholder is an expression made up of the XQuery elements supported by SQL Server.

So far, this is all pretty much like the query() and value() methods described in the last Level. However, the difference comes in the results being returned. The exist() method does not retrieve an XML element or one of its values, but instead returns one of the following values, based on the existence of the element specified in the XQuery expression:

  • A BIT value of 1 if the XQuery expression returns a nonempty result, that is, if the element exists.
  • A BIT value of 0 if the XQuery expression returns an empty result, that is, if the element does not exist.
  • A NULL value if the XML instance is null.

The best way to understand how the exist() method works is to see it in action. In Listing 2, I use the method to test for the existence of a <FirstName> element with a value equal to Jane. Notice that I’ve enclosed the element in brackets and used double quotes for the string value.

SELECT 
  Info_untyped.exist(
    '/People/Person[FirstName="Jane"]')
FROM ClientInfo;

Listing 2: Verifying the existence of a specific element

Because the specified element exists in the targeted XML instance, the SELECT statement returns the value 1, which is the result we expected.

As you can see, it’s a relatively straightforward process to use the exist() method against an untyped XML column. To use the method against a typed XML object, you must include the required namespace-related information, as shown in Listing 3.

SELECT 
  Info_untyped.exist(
    '/People/Person[FirstName="Jane"]'),
  Info_typed.exist(
    'declare namespace ns="urn:ClientInfoNamespace";
    /ns:People/ns:Person[ns:FirstName="Jane"]')  
FROM ClientInfo;

Listing 3: Verifying the existence of an element in a typed column

As you would expect (if you read Level 4), your xquery_exp argument is divided into two parts, separated by a semi-colon, with the entire argument enclosed in single quotes. The first part is the namespace declaration, which specifies the namespace used by the targeted XML instance and defines an alias (ns) for that namespace. The second part of the argument is the path name itself, with ns: inserted before each element. Because the typed column, like the untyped column, contains the element specified by the XQuery expression, it too returns a value of 1.

But suppose the element specified in the expression does not exist, as is the case in Listing 4. This time, for both the untyped and typed columns, the exist() method is looking for a <FirstName> element whose value is equal to Ralph.

SELECT 
  Info_untyped.exist(
    '/People/Person[FirstName="Ralph"]'),
  Info_typed.exist(
    'declare namespace ns="urn:ClientInfoNamespace";
    /ns:People/ns:Person[ns:FirstName="Ralph"]')
FROM ClientInfo;

Listing 4: Testing for a nonexistent value in the typed and untyped columns

Of course, such an element does not exist. As a result, the exist() method, and by extension the SELECT statement, returns a value of 0 for both columns.

So far, the examples have merely provided a way to demonstrate how the exist() method works. In reality, you’re much more likely to use the method to check the existence of an element before carrying out another operation.

For example, in Listing 5, I use the exist() method in a SELECT statement’s WHERE clause to check for the existence of a <Person> element whose id attribute has a value equal to 5678. If this element exists—that is, the value returned by the exist() method equals 1—the condition specified in the WHERE clause evaluates to TRUE and the data queried in the SELECT list is returned.

SELECT
  ClientID,
  Info_untyped.value(
    'concat((/People/Person[@id=5678]/FirstName)[1], " ", 
      (/People/Person[@id=5678]/LastName)[1])', 
    'varchar(25)') AS FullName
FROM ClientInfo
WHERE
  Info_untyped.exist(
    '/People/Person[@id=5678]') = 1;

Listing 5: Using the WHERE clause to test for the existence of an attribute value

The SELECT list itself uses the concat() XQuery function and XML value() method to concatenate the first and last names associated with the <Person> element whose id value equals 5678. Listing 6 shows the results retuned by the SELECT statement. If the WHERE clause had evaluated to FALSE, the statement would have returned no rows.

ClientID    FullName
----------- -------------------------
1           Jane Doe

Listing 6: The results returned after an attribute’s existence has been confirmed

SQL Server also lets you pass variable values into your XQuery expression, which is handy if you want to reuse code. In Listing 7, for instance, I modified the preceding example so that the value 5678 could be passed in through the @id variable.

DECLARE @id INT;
SET @id = 5678;
SELECT
  ClientID,
  Info_untyped.value(
    'concat((/People/Person[@id=sql:variable("@id")]/FirstName)[1],
      " ", (/People/Person[@id=sql:variable("@id")]/LastName)[1])', 
    'varchar(25)') AS FullName
FROM ClientInfo
WHERE
  Info_untyped.exist(
    '/People/Person[@id=sql:variable("@id")]') = 1;

Listing 7: Using a variable to pass a value into an XQuery expression

Notice that, in order to call the variable value from within the XQuery expression, I specified sql:variable("@id"), rather than 5678. Everything else about the SELECT statement is the same as the preceding example, and, as expected, the statement returns the same results.

The XML nodes() Method

Of all the XML methods we’ve discussed so far, the nodes() method is perhaps the trickiest to understand. Unlike the previous methods, which return XML fragments or scalar values, the nodes() method returns a table (rowset view) with a single column, and each row of that table contains a logical copy of the targeted XML instance. The purpose of these results is to let you shred the targeted XML instance into relational data. (This will become clearer as we work through the examples.)

Because the nodes() method returns the data as a rowset view, you can use that method only where a table expression is expected in a Transact-SQL statement, such as in the FROM clause. In addition, you must assign table and column aliases to the method’s results, as shown in the following syntax:

db_object.nodes('xquery_exp') AS table_alias(column_alias)

As with other XML methods, you must specify an XML object and an XQuery expression. But you then follow with the table alias and column alias, in parentheses. The aliases let you reference the rowset view from other parts of the SELECT statement.

The key to understanding how to use the nodes() method is in the concept of a context node. Every XML document has an implicit context node, which is at the top level of the XML instance. You can think of the context node as a reference point within the XML instance. When you use the nodes() method, the context node is set to a specific element within each row of data returned by the method. That context node is identified by the XQuery expression you pass into the method. It’s the context node that lets you shred the XML data in a meaningful way.

Let’s look at an example to demonstrate how this works. But first, we need to add a row to our table. Listing 8 shows the INSERT statement I used to add the row, which adds data only to the untyped column (because that’s all we need right now).

INSERT INTO ClientInfo (Info_untyped)
VALUES
(
  '<?xml version="1.0" encoding="UTF-8"?>
  <People>
    <Person id="4321">
      <FirstName>Jack</FirstName>
      <LastName>Smith</LastName>
    </Person>
    <Person id="8765">
      <FirstName>Jill</FirstName>
      <LastName>Smith</LastName>
    </Person>
  </People>'
);

Listing 8: Inserting an additional row into the ClientInfo table

Now let’s get down to the example. In Listing 9, I use the nodes() method in the FROM clause to return a rowset view of the targeted XML instances. When I call the method, I include an XQuery expression that sets the context node to /People/Person. I also provide the table and column aliases, People and Person, respectively, so I can reference the rowset view in the SELECT list. In addition, I use the nodes() method along with the CROSS APPLY operator in order to associate the ClientInfo table with the rowset view.

SELECT 
  ClientID,
  Person.query('.') AS Person
FROM ClientInfo CROSS APPLY 
  Info_untyped.nodes('/People/Person') AS People(Person);

Listing 9: Using the nodes() method to shred XML data

Although the nodes() method returns a rowset view that you can reference in other parts of the statement, you can refer to that view only through an XML method, as I’ve done in the SELECT list. However, as you can see, my XQuery expression is merely a period, which is shorthand for referencing the context node. Because of this, the SELECT statement returns the results shown in Table 1. (I put the results in a table to make it easier to read the XML instances in the Person column.)

ClientIDPerson
1<Person id="1234">

<FirstName>John</FirstName>

<LastName>Doe</LastName>

</Person>

1<Person id="5678">

<FirstName>Jane</FirstName>

<LastName>Doe</LastName>

</Person>

2<Person id="4321">

<FirstName>Jack</FirstName>

<LastName>Smith</LastName>

</Person>

2<Person id="8765">

<FirstName>Jill</FirstName>

<LastName>Smith</LastName>

</Person>

Table 1: The returned data relative to the context node

Notice that the results include the <Person> element for each person in each row of the ClientInfo table. In other words, because each row in the source table contains two instances of the <Person> element, the rowset view includes two rows for each row in the table, one for each <Person> element. SQL Server then uses the context node to iterate through each XML instance in the rowset view and to return the appropriate instance of <Person>.

Keep in mind that it is the context node that provides the ability to return different results for each row in the rowset view, not the rowset view itself. As you’ll recall, each row in the rowset view contains a full copy of the targeted XML instance. SQL Server iterates through each instance based on the element specified by the context node, similar to how a cursor identifies a current row. However, if you were to call the parent of the context node, your results would be much different. For example, in Listing 10, I use the double period for the query() method’s XQuery expression, which is shorthand for the context node’s parent.

SELECT 
  ClientID,
  Person.query('..') AS Person
FROM ClientInfo CROSS APPLY 
  Info_untyped.nodes('/People/Person') AS People(Person);

Listing 10: Using the context node’s parent accessor to return data

Because we’re calling the context node’s parent, every child element within that parent is also returned, as shown in Table 2. As a result, unique elements are no longer returned for each row in the target table. What this demonstrates, essentially, is that each row in the rowset view contains the entire XML instance.

ClientIDPerson
1<People>

<Person id="1234">

<FirstName>John</FirstName>

<LastName>Doe</LastName>

</Person>

<Person id="5678">

<FirstName>Jane</FirstName>

<LastName>Doe</LastName>

</Person>

</People>

1<People>

<Person id="1234">

<FirstName>John</FirstName>

<LastName>Doe</LastName>

</Person>

<Person id="5678">

<FirstName>Jane</FirstName>

<LastName>Doe</LastName>

</Person>

</People>

2<People>

<Person id="4321">

<FirstName>Jack</FirstName>

<LastName>Smith</LastName>

</Person>

<Person id="8765">

<FirstName>Jill</FirstName>

<LastName>Smith</LastName>

</Person>

</People>

2<People>

<Person id="4321">

<FirstName>Jack</FirstName>

<LastName>Smith</LastName>

</Person>

<Person id="8765">

<FirstName>Jill</FirstName>

<LastName>Smith</LastName>

</Person>

</People>

Table 2: The returned data based on the parent accessor

Chances are, if you’re going to use the nodes() method to shred an XML instance, you’ll want to do it in a more meaningful way than what I’ve done so far. In Listing 11, I use the value() method and concat() function to return the full name for each instance of the <Person> element.

SELECT 
  ClientID,
  Person.value('concat(./FirstName[1], " ",
    ./LastName[1])', 'varchar(30)') AS FullName
FROM ClientInfo CROSS APPLY 
  Info_untyped.nodes('/People/Person') AS People(Person);

Listing 11: Using the context node to return values from child elements

Because I’m using the value() method, my results from the shredded XML are now returned as VARCHAR values, as shown in Listing 12.

ClientID  FullName
--------  ----------
1         John Doe
1         Jane Doe
2         Jack Smith
2         Jill Smith

Listing 12: Shredded XML retuned by the SELECT statement

As you can see, the results are now much more useful. The first and last names of each person listed in the two rows of the ClientInfo table are now returned as relational data. It can take some practice to get used to using the nodes() method, but when you do get it figured out, you’ll find it a useful tool.

Conclusion

As this Level has demonstrated, you can use the exist() method to check the existence of an element within an XML document or fragment. Most often, you’ll be using the method in the WHERE clause to verify an element’s existence before proceeding with the rest of the statement. The nodes() method serves a different function. It lets you shred an XML instance and return the information as relational data. In the next level, I’ll discuss the modify() method, which is the only XML method that lets you manipulate XML data. In the meantime, don’t forget to review the XQuery Language Reference so you better understand how to write XQuery expressions.

This article is part of the parent stairway Stairway to XML

Rate

5 (1)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (1)

You rated this post out of 5. Change rating