Using OpenXML

,

Introduction

OPENXML provides an easy way to use an XML document as a data-source for your procedures. OPENXML allows the data in XML document to be treated just like the columns and rows of your database table i.e., xml tags as columns and the value as rows. Data can be inserted / updated very quickly and efficiently without multiple trips to the database.

For example, if 100 records are to inserted / updated, then the traditional SQL

method is using 100 insert / update statements. This means that 100 trips are

made to the database, which results in degradation of performance. Using XML, these 100 trips can be reduced to 1 trip. This increases the performance of your application.

In this article, we would look at the basic syntax of OpenXML and end with a

simple example.

Basic Syntax

OPENXML (idoc int [in],rowpattern nvarchar[in],[flags byte[in]]) [WITH

SchemaDeclaration | TableName)]

    Arguments

          Idoc is the document handle of

the internal representation of an XML document.

              

This handle is obtained by calling the system stored procedure

sp_xml_preparedocument, which is 
               discussed later.

         Rowpattern is the XPath query used to identify the nodes to be processed as

rows.

        Flags indicates the mapping between the XML data and the relational rowset.

(optional parameter)

           

0 - Defaults to attribute-centric mapping.

           

1 - Attribute-centric mapping. (Combined with XML_ELEMENTS)

           

2 - Element-centric mapping. (Combined with XML_ATTRIBUTES)

           

8 - Combined with XML_ATTRIBUTES or XML_ELEMENTS

       

  SchemaDeclaration

is

the schema definition of the form:

    ColName ColType [ColPattern |

MetaProperty][, ColName ColType [ColPattern | MetaProperty]...]

   

Eg: WITH (EMPLOYEENAME

VARCHAR(30), EMPLOYEESALARY

INTEGER)

         

TableName is

the table name that can be given, instead of Schema Declaration, if a

table exists.

The WITH clause

provides a table format using either SchemaDeclaration or specifying an

existing TableName. If the optional WITH clause is not specified, the

results are returned in an edge table format. Edge tables represent the

fine-grained XML document structure (e.g. element/attribute names, the document

hierarchy, the namespaces, PIs etc.) in a single table.

System Stored Procedures for OpenXML

SQL Server provides system stored procedures that are used in conjunction with OPENXML:

  • sp_xml_preparedocument
  • sp_xml_removedocument

To write queries using

OPENXML, you must first create an internal representation of the XML document by

calling sp_xml_preparedocument. It is similar to LoadXML() function

provided in System.XML namespace. The stored procedure returns a handle to the

internal representation of the XML document. This handle is then passed to

OPENXML, which provides tabular view of the document based on Xpath query.

The internal representation

of an XML document can be removed from memory by calling

sp_xml_removedocument system stored procedure.

About

sp_xml_preparedocument

Steps which occurs when this

procedure is executed:

  1. Reads the XML text provided as input.
  2. Parses the text using the XML parser.
  3. Provides the parsed document, which is in tree form containing various nodes (elements, attributes, text, comments, and so on) in the XML document.
  4. It returns a handle that can be used to access the newly created internal representation of the XML document. This handle is valid for until the connection is reset, or until the execution of sp_xml_removedocument.

Note: A parsed document

is stored in the internal cache of SQL Server 2000. The MSXML parser uses

one-eighth the total memory available for SQL Server. To avoid running out of

memory, run sp_xml_removedocument to free up the memory.

The syntax is:

          sp_xml_preparedocument hdoc OUTPUT [,

xmltext] [, xpath_namespaces]

hdoc

is the handle to the newly created document.(Integer value)

[xmltext]  is the original XML document. The default

value is NULL, in which case an internal representation of an empty XML document

is created.

[xpath_namespaces]  Specifies the namespace

declarations that are used in row and column XPath expressions in OPENXML.

Returns:

0 (success) or >0 (failure)

 

About

sp_xml_removedocument

Removes the internal

representation of the XML document specified by the document handle.

The syntax is:

sp_xml_removedocument hdoc

hdoc

is the handle to the

newly created document.(Integer value)

           

Returns: 0

(success) or >0 (failure)

Basic structure of stored-procedure(using OPENXML)

CREATE PROCEDURE dbo.TestOpenXML
(
	@strXML VARCHAR(2000)
)
AS
DECLARE @XMLDocPointer INT
EXEC sp_xml_preparedocument @XMLDocPointer OUTPUT, @strXML
BEGIN TRANSACTION
    ------ Any DML can be used here -------(see examples)
COMMIT
EXEC sp_xml_removedocument @XMLDocPointer
RETURN

Examples

Consider two tables: Employees & Departments.

    Employees: EmployeeID, EmployeeName, EmployeeSalary, DeptID.

    Departments: DeptID, DeptName.

    (EmployeeID and DeptID are Identity fileds.)

The XML file will follow the structure.

(XML Type 1)

<RECORD>

    <EMPLOYEE>

        <EMPLOYEENAME>Test Name</EMPLOYEENAME>

        <EMPLOYEESALARY>1000</EMPLOYEESALARY>

        <DEPTID>10</DEPTID>

    </EMPLOYEE>

    .

    .

</RECORD>

If 100 employees have been recruited, u can generate a XML

file having the following structure with 100 <EMPLOYEE> tags, the query should

be:

INSERT INTO Employees(EmployeeName, EmployeeSalary, DeptID)
SELECT EMPLOYEENAME, EMPLOYEESALARY, DEPTID
FROM OPENXML(@XMLDocPointer,'/RECORD/EMPLOYEE',2)
WITH (EMPLOYEENAME VARCHAR(30), EMPLOYEESALARY INTEGER, DEPTID INTEGER)

If 100 employees are transferred from Production Department to Stores Department, the query should be:

UPDATE Employees
SET DeptID = xmlTable.DEPTID 
FROM OPENXML(@XMLDocPointer,'/RECORD/EMPLOYEE',2)
WITH (EMPLOYEEID INTEGER, DEPTID INTEGER) XmlTable 
WHERE XmlTable.EMPLOYEEID = Employees.EmployeeID

   

Note: use of alias for XML document is required to avoid confusion.

If the XML file contains attributes, there is minor change to the syntax.

(XML Type 2)

<RECORD>

<EMPLOYEE DEPTID = ‘10’ NAME = ‘Test’ SALARY = ‘10000’/>

<EMPLOYEE DEPTID = ‘10’ NAME = ‘Test’ SALARY = ‘10000’/>

<EMPLOYEE DEPTID = ‘10’ NAME = ‘Test’ SALARY = ‘10000’/>

.

.

</RECORD>

INSERT INTO Employees(EmployeeName, EmployeeSalary, DeptID)
SELECT NAME, SALARY, DEPTID
FROM OPENXML(@XMLDocPointer,'/RECORD/EMPLOYEE',2)
WITH (NAME VARCHAR(30) ‘@NAME’,
SALARY INT ‘@SALARY’,
DEPTID INT '@DEPTID')

Another scenario for attributes (accessing the parent node)
(XML Type 3)

<RECORD>

    <DEPT ID=’10’>

        <EMPLOYEE NAME = ‘Test101’ SALARY = ‘10000’/>

        <EMPLOYEE NAME = ‘Test102’ SALARY = ‘10000’/>

    </DEPT>

    <DEPT ID=’11’>

        <EMPLOYEE NAME = ‘Test111’ SALARY = ‘10000’/>

        <EMPLOYEE NAME = ‘Test112’ SALARY = ‘10000’/>

        <EMPLOYEE NAME = ‘Test113’ SALARY = ‘10000’/>

        <EMPLOYEE NAME = ‘Test114’ SALARY = ‘10000’/>

    </DEPT>
    .

    .
</RECORD>

INSERT INTO Employees (EmployeeName, EmployeeSalary, DeptID)
SELECT NAME, SALARY, DEPTID 
FROM OPENXML (@XMLDocPointer,'/RECORD/DEPT/EMPLOYEE',2)
WITH (NAME VARCHAR(30) ‘@NAME’, 
SALARY INT ‘@SALARY’, 
DEPTID INT '../@ID')

Finally, your store procedure should look like (using XML Type 2):

CREATE PROCEDURE dbo.TestOpenXML
( @strXML VARCHAR(2000)
)
AS
DECLARE @XMLDocPointer INT
EXEC sp_xml_preparedocument @XMLDocPointer OUTPUT, @strXML

BEGIN TRANSACTION

INSERT INTO Employees(EmployeeName, EmployeeSalary, DeptID)
SELECT NAME, SALARY, DEPTID
FROM OPENXML(@XMLDocPointer,'/RECORD/EMPLOYEE',2)
WITH (NAME VARCHAR(30) ‘@NAME’, SALARY INT ‘@SALARY’, DEPTID INT '@DEPTID')
COMMIT

EXEC sp_xml_removedocument @XMLDocPointer
RETURN

Conclusion

With SQL Server 2000, there are many ways you can get XML representation of relational data. One simple way is using the FOR XML clause with SELECT statement. A reverse mechanism, which allows turning XML document into tabular, relational rowset format is provided by T-SQL keyword named OPENXML.

The steps for using OPENXML can be simplified as:

  1. Call sp_xml_preparedocument. (system stored procedure which loads XML document string into memory, parses it and returns a handle).
  2. Use XPath querries for extracting required information from the XML through the handle.
  3. Call sp_xml_removedocument. (system stored procedure which frees up the memory allocated to the internal representation of the XML document).

In this article, we studied the OPENXML syntax and options available with it, with simpe examples. Future articles we will discuss other XML features offered by SQL Server 2000.

Rate

4.5 (2)

Share

Share

Rate

4.5 (2)