Stairway to XML

Stairway to XML: Level 9 - Creating XML-Based Functions

,

In previous Levels, we looked at the methods available to the XML data type that let you view and modify specific components of an XML instance. We also reviewed a number of examples that demonstrated different ways you can use the methods. However, those examples were limited primarily to basic Transact-SQL queries. But you can also use the methods in such database objects as user-defined functions, stored procedures, and views. In this Level, we’ll look at how to use XML methods within user-defined functions to return XML fragments and values from your target XML instance.

When incorporating XML methods into your functions, you create them in much the same way you would any function. If you’re unfamiliar with how to create functions, refer to SQL Server Books Online for details about the different function types and how to define them. This Level is concerned primarily with the XML-related components.

The sections to follow include a number of examples that demonstrate how to use XML methods within your functions. If you plan to try out these examples, you should first run the code shown in Listing 1. It creates the ClientDB database and the ClientInfo table within that database. The code then inserts sample data into the table.

USE master;
GO
IF DB_ID('ClientDB') IS NOT NULL
DROP DATABASE ClientDB;
GO
CREATE DATABASE ClientDB;
GO
USE ClientDB;
GO
IF OBJECT_ID('ClientInfo') IS NOT NULL
DROP TABLE ClientInfo;
GO
CREATE TABLE ClientInfo
(
  ClientID INT PRIMARY KEY IDENTITY,
  Info XML
);
INSERT INTO ClientInfo (Info)
VALUES
(
 '<People>
    <Person id="1234">
      <FirstName>John</FirstName>
      <LastName>Doe</LastName>
    </Person>
    <Person id="5678">
      <FirstName>Jane</FirstName>
      <LastName>Doe</LastName>
    </Person>
    <Person id="2468">
      <FirstName>John</FirstName>
      <LastName>Smith</LastName>
    </Person>
    <Person id="1357">
      <FirstName>Jane</FirstName>
      <LastName>Smith</LastName>
    </Person>
  </People>'
);

Listing 1: Setting up the initial test environment

If you’ve worked through the previous Levels, you’ll notice that the code shown in Listing 1 doesn’t include a typed XML column in the ClientInfo table. For the purposes of demonstrating how to use the XML methods within user-defined functions, this Level focuses on how that is done, rather than distinguishing between typed and untyped XML objects. Just know that if you’re working with typed XML, you should follow the same processes you saw in previous Levels to access the data. You must declare your namespace and provide the proper namespace references in your element paths when referencing an XML instance in an XML object. That said, let’s get started on how to create functions that access XML data.

Creating XML-Based Functions

Before we get into using an XML method within a user-defined function, let’s first look at a function that returns an XML instance. The code in Listing 2 creates the udfClient function, which retrieves the XML instance from the Info column of the ClientInfo table, based on the specified client ID. The function returns the instance as a single XML value.

IF OBJECT_ID('udfClient') IS NOT NULL
DROP FUNCTION udfClient;
GO
CREATE FUNCTION udfClient (@ClientID INT)
RETURNS XML
AS BEGIN
RETURN
(
  SELECT Info
  FROM ClientInfo
  WHERE ClientID = @ClientID
)
END;
GO

Listing 2: Creating a function that returns XML data

The udfClient function is itself very straightforward. It includes a SELECT statement whose WHERE clause limits the results to the row associated through the ClientID value, as specified by the @ClientID parameter value passed into the function when calling it. Notice that the RETURNS clause specifies the XML data type. That means the SELECT statement must return a scalar value that conforms to that data type, which it does.

Once you’ve created the function, you can use a SELECT statement, such as the one shown in Listing 3, to test that the function returns the results you expect. Note that you must specify the schema name when calling a user-defined function, even if that function was created within the dbo schema. You must also specify a value to pass in as an argument that identifies the client ID (in this case, 1).

SELECT dbo.udfClient(1);

Listing 3: Testing the udfClient user-defined function

When you run the SELECT statement, it should return the entire XML instance that you inserted into the table when you first set up your test environment. You can refer back to Listing 1 to verify that the statement returns the correct data.

Because the function returns an XML value, you can use the XML methods when calling the function. For example, the SELECT statement in Listing 4 uses the query() method to return the <FirstName> element for the person whose id attribute value is 1234.

SELECT dbo.udfClient(1).query(
  '/People/Person[@id=1234]/FirstName');

Listing 4: Using the query() method when calling your function

Notice that when calling the udfClient function, you add a period after the function’s closing parenthesis, followed by the method name. You then pass in the necessary XQuery expression as an argument to the method, just like you saw in previous Levels. Now the function returns only the value <FirstName>John</FirstName>.

If you want to return only an element’s value, and not its tags, you can instead use the value() method. Just remember that, when calling this method, you must provide two arguments: the XQuery expression and the data type of the returned value. For example, the SELECT statement shown in Listing 5 retrieves the first name of the person with an id attribute value of 5678, so the second argument to the method specifies the varchar(20) data type.

SELECT dbo.udfClient(1).value(
  '(/People/Person[@id=5678]/FirstName)[1]',
  'varchar(20)') AS FirstName;

Listing 5: Using the value() method when calling your function

Now the statement returns the value Jane. As you can see, because the udfClient function returns an XML value, you can use the XML methods as you would when specifying them with columns or other XML objects.

Of course, creating a user-defined function that returns an entire XML instance will probably be useful only in rare circumstances because the complexity lies when you call the function, not when you define it. Chances are, if you plan to use a function to access XML data, you’ll want to use one or more of the XML methods within the function so you can persist complex queries to the database.

Let’s look at an example that demonstrates how this works. Listing 5 shows the code used to create the udfFullName function, which returns a list of full names for a specific client in the ClientInfo table.

IF OBJECT_ID('udfFullName') IS NOT NULL
DROP FUNCTION udfFullName;
GO
CREATE FUNCTION udfFullName (@ClientID INT)
RETURNS TABLE
AS
RETURN
(
  SELECT Person.value(
    'concat(./FirstName[1], " ", ./LastName[1])',
    'varchar(50)') AS FullName
  FROM ClientInfo CROSS APPLY
    Info.nodes('/People/Person') AS People(Person)
  WHERE ClientID = @ClientID
);
GO

Listing 6: Using the value() and nodes() methods within your function

Notice that the RETURNS clause in the CREATE FUNCTION statement specifies that the results returned by the SELECT statement conform to the TABLE data type, which is a special type used to store a tabular result set. That means our SELECT statement can return any number of rows and columns, rather than only a scalar value.

The SELECT statement itself includes the value() and nodes() methods to retrieve a list of names. The FROM clause uses the nodes() method to parse the XML instance. The clause then uses the CROSS APPLY operator to join the results returned by the nodes() method to the ClientInfo table.

The SELECT clause contains the value() method, which retrieves the names from the joined results, and the method’s XQuery expression uses the concat() function to concatenate the values from the <FirstName> and <LastName> elements.

Once you’ve defined your function, you can then use a SELECT statement similar to the one shown in Listing 7 to test that the function returns the expected results.

SELECT * FROM dbo.udfFullName(1);

Listing 7: Testing the udfFullName user-defined function

Because the udfFullName function returns data as a table, you can use the function only where a table expression is accepted, which in this case, is the FROM clause. Listing 8 shows the results returned by the SELECT statement.

FullName
-----------
John Doe
Jane Doe
John Smith
Jane Smith

Listing 8: Results returned by the udfFullName function

As you can see, a function makes it easy to persist complex code to the database so you can call it when you need it, which can be particularly handy when calling the function from within other database objects. So let’s look at how to do just that.

Using XML-Based Functions in Computed Columns

A computed column is one in which an expression is used to generate the values for that particularly column. The expression can reference other columns within the table, including XML columns.

However, it’s rare that you’ll want to create a computed column based on an entire XML instance, unless you simply plan to convert that column to another type. In most cases, you’ll probably want to use only a value or two from within the XML. The problem is, SQL Server does not let you use XML methods in a computed column. The way around this is to create a function that uses the necessary methods, and then call the function from within your computed column expression.

For example, suppose we want to add a column to our ClientInfo table that calculates the number of <Person> elements within the XML instance associated with a particular row. We would start by first creating a function similar to the one shown in Listing 9. The function uses the value() method and the XQuery count() function to return an INT value that shows the number of <Person> elements.

IF OBJECT_ID('udfPersonCount') IS NOT NULL
DROP FUNCTION udfPersonCount;
GO
CREATE FUNCTION udfPersonCount (@ClientID INT)
RETURNS INT
AS BEGIN
RETURN
(
  SELECT Info.value('count(/People/Person)', 'int')
  FROM ClientInfo
  WHERE ClientID = @ClientID
)
END;
GO

Listing 9: Creating a function that returns the number of people within an XML document

As you can see, when you call the function, you pass in the client ID. The function then returns the element count for the row associated with that ID.

You can test that the function works by using a SELECT statement, as shown in Listing 10. In this case, the function should return a scalar value of 4.

SELECT dbo.udfPersonCount(1) AS PersonCount;

Listing 10: Testing the udfPersonCount user-defined function

Now let’s update the ClientInfo table to include the computed column. Listing 11 shows an ALTER TABLE statement that adds the PersonCount column to the table. Notice that the ADD clause calls the udfPersonCount function and passes in the ClientID column as an argument.

ALTER TABLE ClientInfo
ADD PersonCount AS dbo.udfPersonCount(ClientID);
GO

Listing 11: Using the udfPersonCount function to create a calculated column

To verify your computed column, you need only retrieve the PersonCount column from the ClientInfo table, as shown in Listing 12. In this case, I’ve included a WHERE clause that limits the results to the row with a client ID of 1.

SELECT PersonCount FROM ClientInfo
WHERE ClientID = 1;

Listing 12: Verifying the data in the PersonCount calculated column

As to be expected, the SELECT statement returns the value 4. If the table had included additional rows, and you retrieved the PersonCount data from one of those rows, the value would be specific to the number of <Person> elements in that row. Now let’s look at how to incorporate the function in a check constraint.

Using XML-Based Functions in Check Constraints

As with computed columns, you cannot use the XML methods within a check constraint expression, but you can include a function in the expression, and that function can include the methods.

For example, suppose you want to ensure that a row can be added to the ClientInfo table only if the XML document contains more than one instance of the <People> element. You can create a check constraint that uses the udfPersonCount function in the constraint’s expression, as shown in Listing 13.

ALTER TABLE ClientInfo
WITH NOCHECK ADD CONSTRAINT ck_count 
  CHECK (dbo.udfPersonCount(ClientID) > 1);

Listing 13: Using the udfPersonCount function in a check constraint

As you can see, the ALTER TABLE statement adds the ck_count check constraint. The constraint’s expression compares the output from the udfPersonCount function to the value 1. For the expression to evaluate to true, the XML document must contain more than one instance of the <People> element.

Once we’ve added the check constraint to our table, we can test it by trying to add an XML document that contains only one <Person> element, as shown in Figure 14.

INSERT INTO ClientInfo (Info)
VALUES
(
 '<People>
    <Person id="1234">
      <FirstName>John</FirstName>
      <LastName>Doe</LastName>
    </Person>
  </People>'
);

Listing 14: Inserting a single <Person> instance into the Info column

Not surprisingly, this statement returns an error message (shown in Listing 15) because the INSERT statement causes the check constraint expression to evaluate to false. So the row cannot be inserted into the table.

The INSERT statement conflicted with the CHECK constraint "ck_count". The conflict occurred in database "ClientDB", table "dbo.ClientInfo", column 'ClientID'.
The statement has been terminated.

Listing 15: Error message returned by SQL Server

Now let’s try to add an XML document with two <People> elements. Listing 16 shows the XML document with the additional element.

INSERT INTO ClientInfo (Info)
VALUES
(
 '<People>
    <Person id="1234">
      <FirstName>John</FirstName>
      <LastName>Doe</LastName>
    </Person>
    <Person id="5678">
      <FirstName>Jane</FirstName>
      <LastName>Doe</LastName>
    </Person>
  </People>'
);
-- 1 row inserted

Listing 16: Inserting two <Person> instances into the Info column

This time you should receive a message saying that one row has been inserted into the table. You can confirm this by running the SELECT statement in Figure 17.

SELECT * FROM ClientInfo;

Listing 17: Verifying that a second row as been added to the ClientInfo table

As to be expected, the SELECT statement returns two rows, the original row and the new row. The original row had four instances of the <People> element, and the new one has two instances. Therefore, the calculated column PersonCount should contain the values 4 and 2, respectively.

Conclusion

Being able to use the XML methods within your functions can be a handy tool, regardless of how you plan to use those functions. Yet as this Level has demonstrated, such functions are particularly useful when implementing calculated columns or check constraints because neither supports the direct use of the XML methods. However, other objects do permit their use. In a later level, we’ll cover how to incorporate the methods into views and stored procedures. Many of the principles we covered in this Level will apply to the next one, but that Level will help to round out our discussion on the XML methods and help to give you a more complete picture of the various ways you can access data from an XML instance. In the meantime, for additional information on the topics we discussed in this Level, be sure to refer to SQL Server Books Online.

This article is part of the parent stairway Stairway to XML

Rate

5 (1)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (1)

You rated this post out of 5. Change rating