One of the many useful features gained by SQL Server 2000 is the ability to handle XML documents.
This means that any computer language capable of opening an XML files and calling SQL Server stored procedures can make use of this new ability.
SQL Server can use any of the data types used to hold character information.
TEXT and NTEXT cannot be used as local variables and can only be passed as arguments to a stored procedure.
Bare in mind that the Unicode data types consume twice as much memory as the ASCII equivalents.
|ASCII Type||Unicode Type||Max Size||Comment|
|VARCHAR||NVARCHAR||8000 characters||Variable length|
|TEXT||NTEXT||2Gb||May require you to use SET TEXTSIZE as the @@TEXTSIZE parameter determines how much data is returned from a TEXT/NTEXT field by a SELECT statement.|
Parses the XML document using MSXML 2.0.
WARNING: - This will use 1/8th of the memory available to SQL Server.
If the procedure is successful it outputs a handle to the XML document as stored in the SQL Server cache.
Removes the XML document from the cache and destroys the handle.
Returns the rowset from the prepared XML document.
In the same way that any C++ object created should always be explicitly destroyed and any pointer to a VB object should always have the equivalent SET <obj>=Nothing so you should never use sp_xml_preparedocument in a procedure without using sp_xml_removedocument.
The example document contains an XML schema used within a VB program to draw a form for maintaining web page metadata.
The XML document is written out as a file by a content management system (CMS) in order to allow off-line editing.
The off-line editing process outputs the amended XML file that can then be read and the salient portions written back out into the CMS.
The most important point to note is that as OPENXML is querying an XML document then regardless of the settings within SQL Server the query is case sensitive.
Replacing Page/AuthorsForm/MetaDataItem with page/authorsform/metadataitem will not work. Neither will changing the case of the field names within the WITH statement.
Let us suppose that you have a stored procedure defined as follows.
1. We declare the variable @iDocumentHandle to hold the handle for our XML document.
2. We tell SQL Server to prepare our document.
3. Our OPENXML statement accepts three arguments.
Ø @iDocumentHandle containing our handle to our document.
Ø The XPATH to our metadata items.
Ø An argument to say whether or not we are interested in Attributes or Elements.
4. Our WITH clause says which elements and/or attributes we are interested in.
5. We tell SQL Server to remove our document from the cache and destroy the pointer.
Just for the sake of argument, to call our stored procedure we will have done something like the following
In this case we would get the results set as follows
|Title||Prisoners: getting further help||100|
|Description||Prisoners: getting further help||200|
The OPENXML statement allows all of its arguments to be passed as parameters.
This means that the xpath and flags themselves can be passed as parameters. This gives you considerable flexibility however as the WITH statement cannot accept parameters this means that you will probably have to accept output as an EDGE table (see Books Online).
As the querying of the document is not handled by SQL Server but via MS XML the xpath argument can be any valid xpath supported by the MS XML API. Remember that this is MSXML2.
Page/AuthorsForm/MetaDataItem[*="null"] will return any metadata item where any child element has a value of null.
Page/AuthorsForm/MetaDataItem[Name="Title"] will return any metadata item where the Name element is set to Title.
In our example XML document Name and Value are elements where as MaxSize is an attribute.
The 3rd parameter of OPENXML tells SQL Server what we are interested in.
The parameter is a bit flag so we can combine flags to tell SQL Server to return both, which is what we have done in the preceding example.
The flag settings are as follows.
|0, 1 or omitted||Return only the attributes.|
|2||Return only the elements.|
|3 (combination of 1 and 2)||Return both attributes and elements|
This has the following affect on our results.
Flags set to 0, 1 or omitted
Here we can see that only the MaxSize attribute has been returned.
Flags set to 2
Here we can see that the Name and Value elements have been returned but the MaxSize element has been ignored.
|Title||Prisoners: getting further help||NULL|
|Description||Prisoners: getting further help||NULL|
The WITH statement can be omitted, in which case you will receive what is known as an EDGE table (see books online).
Where it is used the WITH statement can contain field definitions or the name of a table containing the field definitions.
As mentioned earlier the names of the fields must match the names of the elements and are case sensitive.
If you want the fieldnames retrieved to be something other than those in the XML document then the field definitions themselves have an xpath portion.
For example, let us suppose that we want to return the MaxSize attribute as TextBoxSize, then we would amend our WITH statement as follows.
Within the WITH clause the xpath portion of the field definition also allows you to reference the ancestors or parents of the elements you are referencing.
The example given in books online shows a hierarchy of
o Order Headers
§ Order Details
We would query the Order Details line but our xpath arguments would allow us to retrieve information from the Order Header and Customer level of the document.
As a live example let us look at an XML document used to update content within a CMS system.
A <SlotBlock> is a paragraph.
<SlotBlockStyle> is the paragraph style.
<SlotBlockContent> identifies a piece of text within the paragraph.
We want to query the elements within the SlotBlockContent elements so we would write the following.
This would return the following data
|This is my heading||B|
|This is my paragraph|
|This is another paragraph||T|
If we want to grab the <SlotBlockStyle> element from the level above then we would have to add the xpath for the element as shown by the code as marked in bold below.
This would return the following data
|heading3||This is my heading||B|
|Paragraph||This is my paragraph|
|Paragraph||This is another paragraph||T|
The "../" in the xpath definition for ParagraphStyle tells SQL Server to jump up a level to retrieve SlotBlockStyle.
If we added "../../" this would mean jump up two levels to read a specified element.
Finally, let us suppose that there were other elements within <SlotBlockContent> that we didn't know anything about.
We could return information on this using an overflow directive.
The @mp:xmltext field is a special field that will return the parent element and any child elements that have not been explicitly declared in the WITH statement.
If there are no additional child elements then @mp:xmltext will simply return an empty parent element, in this case <SlotBlockContent/>