Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

XML Argument Protocols for SQL 2005 Stored Procedures

By Jack Hummer,

This article describes using some of the new XML functionality provided by MS SQL Server 2005 as applied to argument passing to T-SQL stored procedures. One of the more tedious areas of application support is managing arguments passed to and from T-SQL stored procedures. Unlike other RDBMS engines such as Oracle, T-SQL does not support the notion of passing in record structures as arguments from a client directly to a T-SQL stored procedure. Realizing that the new XML functionality might offer some relief in this area, I did some searching to see what anyone else had come up with. I found some articles related to XML, but the articles I did find used SQL 2000 and had the primary focus of using XML for bulk Inserts with no preliminary data validations or other processing steps (see References below). As I had some different objectives in mind, and I found no articles highlighting the new SQL 2005 XML features, I decided to present my ideas and results here.

Background

The use of MS SQL Server as a web back-end storage system is well understood. Maintaining the stored procedures that interact with web clients can be tedious, with the lack of record structure support to and from stored procedures a contributing factor. There are several consequences to this, including:

  • Commonly used procedures have multiple copies created with slightly different arguments
  • There are many instances where passing numerous arguments and/or varying number of arguments to stored procedures can be awkward or nearly impossible
  • Altering the arguments to a commonly used procedure can be a high-maintenance task any dependent code snippets must be changed in parallel
  • T-SQL does not include a formal mechanism to pass in record structures from a client application, much less support passing in a varying number of records

Transaction support is another labor intensive effort: In a web back end, the database context, or session, only exists as long as that one procedure call is active. Thus, a transaction that requires multiple procedure calls requires some way to wrap the transaction/rollback around the multiple invocations. For example, storing an e-commerce transaction can involve a dozen tables and multiple, varying number of rows per table (e.g. sales line items) which is really impossible to pack into standard procedure call arguments. Some people use this difficulty as a reason to embed SQL in their application code instead of placing it into a stored procedures.

Another interesting case arises when you want to send a stored procedure the minimal number of columns in a row to update. First, we have the issue of variable number of arguments (which columns are getting updated?); next, we are faced with future maintenance issues as columns are added to (or deleted from) the table; then the update code has to decipher a tri-state assertion for each column in the table: Do we leave the column alone, update it to a value, or update it to Null? Traditional argument calling sequences might involve two arguments per column, one to pass the new value, and an indicator variable to say if we are really updating or not - not pretty. The open structure of XML lets us define such things easily.

Finally, there is the reverse interface issue of sending back to the calling program multiple values, possibly structured; for this discussion I will use error messages. Suppose we have a row insert procedure, and we do a number of validations on the column values before executing the Insert command. What we do not want to do is bail out after detecting the first error, go back to the user with one error, they fix it and resubmit, we bail out on the next error, etc. until the user gets dizzy. We would like to do all our data validations and pass back all the conditions and messages in one pass, so the user can fix them in one try. Here again we need to pass multiple results (independent of a dataset), which the basic T-SQL Procedure argument list makes difficult if not impossible.

Thus the issues at hand are:

  • Number of arguments (complexity), and maintenance issues (add/delete arguments)
  • Tendency to create multiple copies of the same procedure for slightly varying input conditions
  • Passing a record structure
  • Passing a varying number of multiple records
  • Adding processing criteria to arguments (e.g. no update; update with value; update to Null)
  • Pass back a varying number of (potentially structured) values, e.g. error messages.

XML As A Solution

T-SQL has native support for manipulating XML in SQL 2000, but with the release of SQL 2005, XML support is even better. T-SQL arrives with additional functions and an actual XML native datatype [essentially an nvarchar(max) with storage in "optimised UTF-16 characters" per BOL]. There are some curious comments in the BOL about the overhead of XML processing, but I have not yet tried to measure it. My general rule is, if a certain syntax or process makes for increases in programming efficiency and reliability as well as reducing long term maintenance, I care less about a little additional overhead on the machine.

Client Code

On the application (e.g. web page) side, instantiating the MSXML object to create an XML document certainly has overhead, especially as it supports a lot of esoteric XML syntax that may be seldom used. With the new SQL Server 2005 enhancements, some of the client-side processing can be entirely relieved: XML arguments can be treated as a string and passed to a stored procedure without any need for complicated objects to help us. You should certainly use the MSXML object if it fills other requirements the client-side application needs. Note: If you do not use the MSXML object, then you need to handle the translation of reserved characters into their alternate representations, a.k.a. 'entities' . For example the XML reserved character "<" [less than sign/open angle bracket] in user data must be replaced with the XML Entity "&lt;". The same holds for ">", "'", """, and "&". You could write a small function to handle those [hint: process "&" first].

Using XML makes the calling convention to stored procedures very simple: As the XML is by its nature structured, when we extract values out of the XML from inside the procedure, the code need make no assumptions about ordering of individual data items; handling data element existence or non-existence is also handled gracefully. The presence of possibly obsolete data items is no longer a bother to the stored procedure - it simply ignores them. All of this helps reduce maintenance.

Web Server Application Code

We utilize the fact that an XML document can be represented by a character string, and simply append and build it. Note, if you really wanted, you could instantiate the MSXML object and build your XML with that, but this seems far too much for the job at hand. My presumption is that all of the user input has been collected and validated to some extent by the application program, and is then to be passed to the database as a single transaction.

Example: Insert Customer Row [vbscript]

xcmd = "<arg><first_name>" & firstName & "</first_name><last_name>" & lastName &_
"</lastname></arg>"

where arg is the root tag for the whole document, and firstName, lastName etc. are the variables holding the user input. If a data item is missing, the stored procedure can fill in a default or raise an error, depending on the business rules involved.

Example: Update Customer Row [vbscript]

xcmd = "<arg><customerid>74285</customerid><first_name>Robert</first_name>" &_

 "<company_name null='yes'/></arg>"

This could be sent to an Update procedure. It provides a primary key, a new first name, and an explicit instruction to set the company name to Null; all other columns are unmodified.

Example: Create Sales Transaction [vbscript]

xcmd = "<arg><customer><first_name>Mary</first_name>. . . .</customer>" &_
 "<order_header><order_date>08/15/2006</order_date>. . . .</order_header>" &_
 "<order_lines count='6'><line productid='7294' . . .>" &_
      "<line productid='8241' . . .>. . . .</order_lines>" &_
 "<payment><type>Visa</type>. . . .</payment></arg>"

Here we consolidate multiple record types, and multiple records for sales line items, all into one procedure call. This can then easily be wrapped in a single Transaction in case of unexpected database errors. The line count is optional, but if you know it then why not send it along, the procedure can make use of it. Otherwise the procedure will have to loop on searches for order_lines until it runs out of them. Remember that in most cases the non-significant white space (blanks, new lines, etc. in between elements) are ignored and often disappear when the XML document is stored in a canonical form, such as an XML variable.

I have written web page scripts that do just this kind of processing. First you call a procedure to add the customer. Then call another procedure to add the order header. Then a big loop call to insert each sale line item. Yet another procedure to post the payment. And if one of these burps along the way? Go write another procedure to try to clean up a half-baked transaction! That kind of processing is where this interface really shines.

Example: Error Message Return Results

<err>
 <error>
 <err_code>APP001</err_code><err_msg>First Name is a required field.</err_msg>
 </error>
 <error>
 <err_code>APP002</err_code><err_msg>Last Name is a required field.</err_msg>
 </error>
</err>

This might be returned by a procedure call. Multiple messages can easily be handled, as well as a record structure (shown here) composed of both an error code as well as the error text message. Successful execution is indicated by simply returning Null.

Stored Procedure Interface and Code

The calling program (a server-side web page script, for our examples) makes a standard procedure call, passing the string that has been built as an input XML argument. plus an output XML argument for the error status return. The variable types could typically be double-wide character, e.g. Unicode.

Here is an excerpt of a stored procedure to validate values then Insert a customer row.

/************************************************************************************
file is xtest.sql Copyright (c) 2006 Jack A. Hummer
*/
CREATE PROCEDURE [dbo].[xtest]
/*
Purpose
 Example procedure demonstrating XML input and output arguments.
 Processing to validate and insert a Customer row.
Input */
@xin
XML, /* input values for columns
 <arg>
 <fname>...</fname> First Name, required
 <lname>...</lname> Last Name, required
 </arg>
 Missing fields, or fields passed as <xxxx
null="yes"/> are considered
 Null and will have defaults applied as needed.
Output */
@id_customer
INT = NULL OUTPUT, -- new primary key
@xout
XML = NULL OUTPUT /* error message(s), or Null = good return
 <err>

<error><err_code>nnn</err_code><err_msg>xxxxxxx</err_msg></error>
 </err>
Notes
 
Revision History
14aug06 jah created
************************************************************************************/
AS
DECLARE @first_name NVARCHAR(50), @last_name NVARCHAR(50), @nvl VARCHAR(3)
SET NOCOUNT ON;
SET @xout = NULL -- default is good return
 
-- process first name - required field
SET @nvl = @xin.value('(/arg/fname/@null)[1]',
'NVARCHAR(3)')
SET @first_name = @xin.value('(/arg/fname)[1]', 'NVARCHAR(50)')
IF @nvl = 'yes' OR @first_name IS NULL BEGIN

SET @xout = dbo.xerr(@xout, 'E001', 'First Name cannot be Null')
END

-- process last name - required field
SET @nvl = @xin.value('(/arg/lname/@null)[1]',
'NVARCHAR(3)')
SET @last_name = @xin.value('(/arg/lname)[1]', 'NVARCHAR(50)')
IF @nvl = 'yes' OR @last_name IS NULL BEGIN

SET @xout = dbo.xerr(@xout, 'E002', 'Last Name cannot be Null')
END
-- more field validations.....
-- Finished field validations

IF @xout IS NOT NULL RETURN
-- continue processing.....
-- A Try/Catch around other statements could also append an error.
RETURN

For convenience as well as modularity, I created a small T-SQL function to insert one error record into the XML error document. This will get enhanced in the future, for example returning the error message in the users language, or optionally writing it to a log file.

/************************************************************************************
file is xerr.sql Copyright (c) 2006 Jack A. Hummer
*/
CREATE FUNCTION [dbo].[xerr]
/*
Purpose
 Example function demonstrating XML error arguments.
 Add one error record to the XML error structure.
Input */
(@xmsg XML, -- current error msg(s), or Null
@err_code
NVARCHAR(20), -- new error code
@err_msg
NVARCHAR(1000) -- new error message
)
-- Output
RETURNS XML /* updated document
 <err>

<error><err_code>nnn</err_code><err_msg>xxxxxxx</err_msg></error>
 </err>
Notes
 
Revision History
14aug06 jah created
************************************************************************************/
AS
BEGIN
 
-- Initialize the document if needed
IF @xmsg IS NULL SET @xmsg = '<err></err>'

-- Insert this error sequentially after any others
SET @xmsg.modify(' 
insert (
 <error>

<err_code>{sql:variable("@err_code")}</err_code>,

<err_msg>{sql:variable("@err_msg")}</err_msg>
 </error>
 )
as last into (/err)[1] ')

-- Return the result of the function
RETURN @xmsg
END

A few notes on this code and related programming issues:

  • The XML data type acts very much like an object, a new twist for T-SQL. It is also very flexible, as it is valid as a variable, argument, or function return. However, from ASP, ADO did not like it as an output variable type.
  • Many of the new methods/arguments require lower case only, an abrupt change from most vendors SQL syntax which has usually been case insensitive. Not my preference but, hey, they forgot to ask me.
  • The .value method utilizes XQuery syntax. If you have not seen this before, it is a complex language used to find things inside an XML document. Don't try to learn it from BOL, look for an introductory tutorial on the web (but, most of the introductory material I found progressed very quickly into advanced language features without completely explaining the fundamentals).
  • You will need to experiment a bit to see what happens when you search for element values or attribute values and they do or do not exist.
  • The .modify method uses both standard XQuery plus new extensions that Microsoft had to invent. The standards people have not yet addressed XML update syntax. Re-read the BOL articles a few times until it starts to make sense. In particular, the first argument must litterally be a string litteral, which seems rather limiting at first, unless you manage to stumble across the "sql:variable()" work-around.
  • Contrary to popular belief, apostrophes (a.k.a. single quote) may optionally be used around XML attribute values.
  • The .value method returns the reserved entities back into their original characters.
  • Within T-SQL, CAST or CONVERT an XML variable to a character string type leaves the special entities intact.
  • The .value method will right-trim the return string according to the data type specification, with no error.

Special Note: when putting together these ideas into some real production code, I found the procedure return argument as an XML data type did not work coming back through ADO to vbScript, so I changed it to an NVARCHAR. This worked just fine when loading it into an MSXML object. The .Net infrastructure may provide enhanced support for passing XML.

Stored Procedure - XML Code Loop

So you may ask, How do I exactly process an unknown number of records in the input XML structure? Given the restrictive syntax on the built-in XML methods, the only solution I have come up with is sp_executesql. You will customize this for your situation, but here is a small example:

SET @n = 1
SET @params = N'@x XML, @value INT OUTPUT'
WHILE 1 = 1
 BEGIN
  SET @cmd = N'SET @value = @x.value(''(/arg/cat)[' +    CAST(@n AS NVARCHAR(2)) + ']'', ''INT'')'
  EXECUTE sp_executesql @cmd, @params, @x = @xarg, @value = @id_cat OUTPUT 

  IF @id_cat IS NULL
   BREAK
   -- do some processing
 SET @n = @n + 1
END -- while

Example: ASP Calls the Stored Procedure [vbscript]

Dim xin, xout,id_customer, conn, sqlcmd
xin = "<arg><fname>Mary</fname></arg>"
 
Set conn = Server.CreateObject("ADODB.Connection")
Set sqlcmd = Server.CreateObject("ADODB.Command")
conn.Open dsn
sqlcmd.ActiveConnection = conn
sqlcmd.CommandText = "dbo.xtest"
sqlcmd.CommandType = adCmdStoredProc
sqlcmd.Parameters.Append
sqlcmd.CreateParameter _ 
 ("@xin", adVarChar, adParamInput, 1000, xin)
sqlcmd.Parameters.Append
sqlcmd.CreateParameter _
 ("@id_customer", adInteger, adParamOutput)
sqlcmd.Parameters.Append
sqlcmd.CreateParameter _
 ("@xout", adVarWChar, adParamOutput, 2000)
sqlcmd.Prepared = True
sqlcmd.CommandTimeout = 40
sqlcmd.Execute , , adExecuteNoRecords
xout = sqlcmd.Parameters("@xout")

If Not IsNull(xout) Then
 Dim oDoc, loadOk, oNodes, oNode, errCode, errMsg
 Set oDoc = Server.CreateObject("MSXML2.DOMDocument.4.0")
 oDoc.async = False
 loadOk = oDoc.loadXML(xout)

 Set oNodes = oDoc.selectNodes("/err/error")

 For Each oNode In oNodes
 Response.Write oNode.nodeType & "<br>"
 errCode = oNode.selectSingleNode("err_code").nodeTypedValue
 errMsg = oNode.selectSingleNode("err_msg").nodeTypedValue

Response.Write("<tr><td>Error Code: " & errCode & "</td><td>" &_

 errMsg & "</td></tr>" & vbCrLf)
 Next

 Set oNode = Nothing
 Set oNodes = Nothing
 Set oDoc = Nothing 
End If

The above is a very simple example, but demonstrates everything we have discussed. The error display to the web page could be wrapped in a small function, and as an alternative could be handled nicely with some XSLT code as well.

What About Schemas?

Yes, for added data validation you could add schema definitions. But for something as transient as argument passing, it seems like too much extra overhead. But for some very critical processing it may be appropriate.

Speaking of Overhead

The extra overhead of converting native data types into XML and back, as well as the overhead of instantiating the MSXML object seems to be the only noticable downside to this approach. For me, the added capability of passing variable amounts of data plus any reduction in long term maintenance makes it a winner. But probably no worse than the creation of a recordset to return structured data. Certainly many if not most procedure calls will not need complex variable input arguments, and the XML output processing is only invoked if there is an error, presumably only a small per centage of the time.

In Conclusion

MS SQL Server T-SQL does not provide a robust interface for passing in varying numbers of arguments or any kind of record structure. On output we may also desire a logical multiple record structure, separate from a data recordset. And the more arguments a procedure takes, the bigger the maintenance chore.

An XML input argument provides a simple way to send from application code a variable number of arguments, including arbitrarily complex logical record structures. An XML output argument likewise is a good container for multiple return values, again including optional record structures. Since the stored procedure just ignores any input elements it does not need, and can often provide defaults for elements not passed, you have a greater chance of being able to make changes to the stored procedure and/or underlying database structure and have relatively little or no impact on existing programs that use it.

References:

These all reference SQL 2000, and mostly use the XML input to just do a direct bulk Insert without preliminary data validations.

http://www.devx.com/dotnet/Article/16155/0/page/2
http://www.sql-server-performance.com/jb_openxml.asp
http://msdn.microsoft.com/msdnmag/issues/05/06/DataPoints/
http://www.devx.com/dotnet/Article/16032/1954?pf=true

A Microsoft article on XML functionality overview in SQL Server, emphasis on SQL 2005.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sql2k5xmloptions.asp

Introduction to XQuery in SQL Server 2005

http://msdn.microsoft.com/SQL/default.aspx?pull=/library/en-us/dnsql90/html/sql2k5_xqueryintro.asp

Total article views: 9354 | Views in the last 30 days: 5
 
Related Articles
FORUM

Stored procedure error handler

Stored procedure error handler

FORUM

VB run-time error : Procedure or function spSaveCasteCerti has too many arguments specified.

VB run-time error : Procedure or function spSaveCast has too many argument specified

FORUM

Error in Stored Procedure

Error in COALESCE stored Procedure

ARTICLE

Avoiding Stored Procedures

This week Steve Jones argues against stored procedures. Is it a good argument or do want to stick wi...

FORUM

Nested Stored procedure

Error In nested Stored Procedure

Tags
miscellaneous    
xml    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones