SQLServerCentral Article

Querying XML files using SQL 2000

,

Introduction

One of the many useful features gained by SQL Server 2000 is the ability to handle XML documents.

This means that any computer language capable of opening an XML files and calling SQL Server stored procedures can make use of this new ability.

Data types used

SQL Server can use any of the data types used to hold character information.

TEXT and NTEXT cannot be used as local variables and can only be passed as arguments to a stored procedure.

Bare in mind that the Unicode data types consume twice as much memory as the ASCII equivalents.

ASCII Type Unicode Type Max Size Comment
CHAR NCHAR 8000 Fixed length
VARCHAR NVARCHAR 8000 characters Variable length
TEXT NTEXT 2Gb May require you to use SET TEXTSIZE as the @@TEXTSIZE parameter determines how much data is returned from a TEXT/NTEXT field by a SELECT statement.

SQL Commands Used.

SQL Command Comment
sp_xml_preparedocument

Parses the XML document using MSXML 2.0.

WARNING: - This will use 1/8th of the memory available to SQL Server.

If the procedure is successful it outputs a handle to the XML document as stored in the SQL Server cache.

sp_xml_removedocument

Removes the XML document from the cache and destroys the handle.

OPENXML

Returns the rowset from the prepared XML document.

In the same way that any C++ object created should always be explicitly destroyed and any pointer to a VB object should always have the equivalent SET <obj>=Nothing so you should never use sp_xml_preparedocument in a procedure without using sp_xml_removedocument.


Example document

The example document contains an XML schema used within a VB program to draw a form for maintaining web page metadata.

The XML document is written out as a file by a content management system (CMS) in order to allow off-line editing.

The off-line editing process outputs the amended XML file that can then be read and the salient portions written back out into the CMS.

<?xml version="1.0"?>

<Page>

                <AuthorsForm>

                                <MetaDataItem MaxSize="100">

                                                <Name>Class Number</Name>

                                                <Value><![CDATA[null]]></Value>

                                                <Type>text</Type>

                                                <CTRLType>textbox</CTRLType>

                                                <CTRLFont>MS Sans Serif</CTRLFont>

                                                <Mandatory>True</Mandatory>

                                                <Editable>False</Editable>

                                                <ToolTip>Please enter a Class Number value.</ToolTip>

                                </MetaDataItem>

                                <MetaDataItem MaxSize="100">

                                                <Name>Title</Name>

                                                <Value><![CDATA[Prisoners: getting further help]]></Value>

                                                <Type>text</Type>

                                                <CTRLType>textbox</CTRLType>

                                                <CTRLFont>MS Sans Serif</CTRLFont>

                                                <Mandatory>True</Mandatory>

                                                <Editable>True</Editable>

                                                <ToolTip>Please enter a Title value.</ToolTip>

                                </MetaDataItem>

                                <MetaDataItem MaxSize="50">

                                                <Name>Author</Name>

                                                <Value><![CDATA[David Poole]]></Value>

                                                <Type>text</Type>

                                                <CTRLType>textbox</CTRLType>

                                                <CTRLFont>MS Sans Serif</CTRLFont>

                                                <Mandatory>True</Mandatory>

                                                <Editable>False</Editable>

                                                <ToolTip>Please enter a Author value.</ToolTip>

                                </MetaDataItem>

                                <MetaDataItem MaxSize="50">

                                                <Name>Translator</Name>

                                                <Value><![CDATA[Renee Arnoux]]></Value>

                                                <Type>text</Type>

                                                <CTRLType>textbox</CTRLType>

                                                <CTRLFont>MS Sans Serif</CTRLFont>

                                                <Mandatory>True</Mandatory>

                                                <Editable>True</Editable>

                                                <ToolTip>Please enter a Translator value.</ToolTip>

                                </MetaDataItem>

                                <MetaDataItem MaxSize="100">

                                                <Name>Subject</Name>

                                                <Value><![CDATA[Civil Rights]]></Value>

                                                <Type>text</Type>

                                                <CTRLType>textbox</CTRLType>

                                                <CTRLFont>MS Sans Serif</CTRLFont>

                                                <Mandatory>True</Mandatory>

                                                <Editable>False</Editable>

                                                <ToolTip>Please enter a Subject value.</ToolTip>

                                </MetaDataItem>

                                <MetaDataItem MaxSize="200">

                                                <Name>Description</Name>

                                                <Value><![CDATA[Prisoners: getting further help]]></Value>

                                                <Type>text</Type>

                                                <CTRLType>textbox</CTRLType>

                                                <CTRLFont>MS Sans Serif</CTRLFont>

                                                <Mandatory>True</Mandatory>

                                                <Editable>False</Editable>

                                                <ToolTip>Please enter a Description value.</ToolTip>

                                </MetaDataItem>

                                <MetaDataItem MaxSize="100">

                                                <Name>Publisher</Name>

                                                <Value><![CDATA[HMSO]]></Value>

                                                <Type>text</Type>

                                                <CTRLType>textbox</CTRLType>

                                                <CTRLFont>MS Sans Serif</CTRLFont>

                                                <Mandatory>True</Mandatory>

                                                <Editable>False</Editable>

                                                <ToolTip>Please enter a Publisher value.</ToolTip>

                                </MetaDataItem>

                                <MetaDataItem MaxSize="10">

                                                <Name>Date</Name>

                                                <Value><![CDATA[2002-1-09]]></Value>

                                                <Type>text</Type>

                                                <CTRLType>textbox</CTRLType>

                                                <CTRLFont>MS Sans Serif</CTRLFont>

                                                <Mandatory>True</Mandatory>

                                                <Editable>False</Editable>

                                                <ToolTip>Please enter a Date value.</ToolTip>

                                </MetaDataItem>

                                <MetaDataItem MaxSize="150">

                                                <Name>Resource Type</Name>

                                                <Value><![CDATA[Document]]></Value>

                                                <Type>text</Type>

                                                <CTRLType>combobox</CTRLType>

                                                <CTRLFont>MS Sans Serif</CTRLFont>

                                                <Mandatory>True</Mandatory>

                                                <Editable>False</Editable>

                                                <ToolTip>Please enter a Resource Type value.</ToolTip>

                                </MetaDataItem>

                                <MetaDataItem MaxSize="100">

                                                <Name>Resource Identifier</Name>

                                                <Value><![CDATA[prisoners_getting_further_help]]></Value>

                                                <Type>text</Type>

                                                <CTRLType>textbox</CTRLType>

                                                <CTRLFont>MS Sans Serif</CTRLFont>

                                                <Mandatory>True</Mandatory>

                                                <Editable>False</Editable>

                                                <ToolTip>Please enter a Resource Identifier value.</ToolTip>

                                </MetaDataItem>

                </AuthorsForm>

</Page>


Querying the document

Case sensitivity

The most important point to note is that as OPENXML is querying an XML document then regardless of the settings within SQL Server the query is case sensitive.

Replacing Page/AuthorsForm/MetaDataItem with page/authorsform/metadataitem will not work. Neither will changing the case of the field names within the WITH statement.

Using a stored procedure

Let us suppose that you have a stored procedure defined as follows.

CREATE PROC usp_ParseXML @stXML NTEXT AS

                DECLARE @iDocumentHandle Int

exec sp_xml_preparedocument @iDocumentHandle OUTPUT, @stdoc

 

SELECT *

FROM      OPENXML (@iDocumentHandle , 'Page/AuthorsForm/MetaDataItem',3)

WITH (Name VARCHAR(50) , Value VARCHAR(50), MaxSize CHAR(3) )

 

exec sp_xml_removedocument @iDocumentHandle

1.      We declare the variable @iDocumentHandle to hold the handle for our XML document.

2.      We tell SQL Server to prepare our document.

3.      Our OPENXML statement accepts three arguments.

Ø      @iDocumentHandle containing our handle to our document.

Ø      The XPATH to our metadata items.

Ø      An argument to say whether or not we are interested in Attributes or Elements.

4.      Our WITH clause says which elements and/or attributes we are interested in.

5.      We tell SQL Server to remove our document from the cache and destroy the pointer.

Just for the sake of argument, to call our stored procedure we will have done something like the following

DECLARE @stXML VARCHAR(8000)

SET @stDoc=

'<?xml version="1.0"?>

<Page> …..etc

exec usp_ParseXML @stDoc


In this case we would get the results set as follows

Name Value MaxSize
Class Number null 100
Title Prisoners: getting further help 100
Author David Poole 50
Translator Renee Arnoux 50
Subject Civil Rights 100
Description Prisoners: getting further help 200
Publisher HMSO 100
Date 2002-1-09 10
Resource Type Document 150
Resource Identifier prisoners_getting_further_help 100

Arguments of the OpenXML clause

The OPENXML statement allows all of its arguments to be passed as parameters.

This means that the xpath and flags themselves can be passed as parameters.  This gives you considerable flexibility however as the WITH statement cannot accept parameters this means that you will probably have to accept output as an EDGE table (see Books Online).

xpath

As the querying of the document is not handled by SQL Server but via MS XML the xpath argument can be any valid xpath supported by the MS XML API. Remember that this is MSXML2.

Page/AuthorsForm/MetaDataItem[*="null"] will return any metadata item where any child element has a value of null.

Page/AuthorsForm/MetaDataItem[Name="Title"] will return any metadata item where the Name element is set to Title.

Flags

In our example XML document Name and Value are elements where as MaxSize is an attribute.

The 3rd parameter of OPENXML tells SQL Server what we are interested in.

The parameter is a bit flag so we can combine flags to tell SQL Server to return both, which is what we have done in the preceding example.

The flag settings are as follows.

Value Meaning
0, 1 or omitted Return only the attributes.
2 Return only the elements.
3 (combination of 1 and 2) Return both attributes and elements

This has the following affect on our results.

Flags set to 0, 1 or omitted

Here we can see that only the MaxSize attribute has been returned.

Name Value MaxSize
NULL NULL 100
NULL NULL 100
NULL NULL 50
NULL NULL

50

NULL NULL

100

NULL NULL 200
NULL NULL 100
NULL NULL 10
NULL NULL 150
NULL NULL 100

Flags set to 2

Here we can see that the Name and Value elements have been returned but the MaxSize element has been ignored.

Name Value MaxSize
Class Number NULL NULL
Title Prisoners: getting further help NULL
Author David Poole NULL
Translator Renee Arnoux NULL
Subject Civil Rights NULL
Description Prisoners: getting further help NULL
Publisher HMSO NULL
Date 2002-1-09 NULL
Resource Type Document NULL
Resource Identifier prisoners_getting_further_help NULL

The WITH statement

The WITH statement can be omitted, in which case you will receive what is known as an EDGE table (see books online).

Where it is used the WITH statement can contain field definitions or the name of a table containing the field definitions.

As mentioned earlier the names of the fields must match the names of the elements and are case sensitive.

If you want the fieldnames retrieved to be something other than those in the XML document then the field definitions themselves have an xpath portion.

For example, let us suppose that we want to return the MaxSize attribute as TextBoxSize, then we would amend our WITH statement as follows.

WITH(Name VARCHAR(50) , Value VARCHAR(50), TextBoxSize CHAR(3) 'MaxSize')

Different levels of the XML schema

Within the WITH clause the xpath portion of the field definition also allows you to reference the ancestors or parents of the elements you are referencing.

The example given in books online shows a hierarchy of

Ø      Customers

o        Order Headers

§         Order Details

We would query the Order Details line but our xpath arguments would allow us to retrieve information from the Order Header and Customer level of the document.

As a live example let us look at an XML document used to update content within a CMS system.

<?xml version="1.0"?>

<Page>

 <SlotBlock>

                <SlotBlockStyle>heading3</SlotBlockStyle>

                <SlotBlockContent>

                                <PreObject>This is my heading</PreObject>

                                <PostObject />

                                <ObjectID>-1</ObjectID>

                                <ObjectType>B</ObjectType>

                </SlotBlockContent>

 </SlotBlock>

 <SlotBlock>

                <SlotBlockStyle>Paragraph</SlotBlockStyle>

                <SlotBlockContent>

                                <PreObject><![CDATA[This is my paragraph]]></PreObject>

                                <PostObject />

                                <ObjectID>0</ObjectID>

                                <ObjectType />

                </SlotBlockContent>

                <SlotBlockContent>

                                <PreObject><![CDATA[This is another paragraph]]></PreObject>

                                <PostObject />

                                <ObjectID>0</ObjectID>

                                <ObjectType>T</ObjectType>

                </SlotBlockContent>

  </SlotBlock>

</Page>

A <SlotBlock> is a paragraph.

<SlotBlockStyle> is the paragraph style.

<SlotBlockContent> identifies a piece of text within the paragraph.

We want to query the elements within the SlotBlockContent elements so we would write the following.

DECLARE @stDoc VARCHAR(8000)

DECLARE @stxPath VARCHAR(50)

DECLARE @iFlags Int

DECLARE @iDocumentHandle Int

SET @iFlags=2

SET @stxPath='Page/SlotBlock/SlotBlockContent'

SET @stDoc=……

 

Our example document

 

exec sp_xml_preparedocument @iDocumentHandle OUTPUT, @stdoc

 

SELECT *

FROM      OPENXML (@iDocumentHandle , @stxpath,@iFlags)

WITH(      PreObject VARCHAR(50) ,

                PostObject VARCHAR(50),

                ObjectType CHAR(1) )

 

exec sp_xml_removedocument @iDocumentHandle

This would return the following data

PreObject PostObject ObjectType
This is my heading B
This is my paragraph
This is another paragraph T

If we want to grab the <SlotBlockStyle> element from the level above then we would have to add the xpath for the element as shown by the code as marked in bold below.

DECLARE @stDoc VARCHAR(8000)

DECLARE @stxPath VARCHAR(50)

DECLARE @iFlags Int

DECLARE @iDocumentHandle Int

SET @iFlags=2

SET @stxPath='Page/SlotBlock/SlotBlockContent'

SET @stDoc=……

 

Our example document

 

exec sp_xml_preparedocument @iDocumentHandle OUTPUT, @stdoc

 

SELECT *

FROM      OPENXML (@iDocumentHandle , @stxpath,@iFlags)

WITH(      ParagraphStyle VARCHAR(50) '../SlotBlockStyle'

                PreObject VARCHAR(50) ,

                PostObject VARCHAR(50),

                ObjectType CHAR(1) )

 

exec sp_xml_removedocument @iDocumentHandle

This would return the following data

ParagraphStyle PreObject PostObject ObjectType
heading3 This is my heading B
Paragraph This is my paragraph
Paragraph This is another paragraph T

The "../" in the xpath definition for ParagraphStyle tells SQL Server to jump up a level to retrieve SlotBlockStyle.

If we added "../../" this would mean jump up two levels to read a specified element.

Additional Information

Finally, let us suppose that there were other elements within <SlotBlockContent> that we didn't know anything about.

We could return information on this using an overflow directive.

WITH(      ParagraphStyle VARCHAR(50) '../SlotBlockStyle'

                PreObject VARCHAR(50) ,

                PostObject VARCHAR(50),

                ObjectType CHAR(1) ,

                Overflow VARCHAR(250) '@mp:xmltext' )

The @mp:xmltext field is a special field that will return the parent element and any child elements that have not been explicitly declared in the WITH statement.

If there are no additional child elements then @mp:xmltext will simply return an empty parent element, in this case <SlotBlockContent/>

Rate

4 (1)

You rated this post out of 5. Change rating

Share

Share

Rate

4 (1)

You rated this post out of 5. Change rating