Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Using OpenXML

By Vasant Raj, (first published: 2005/05/12)

Introduction

OPENXML provides an easy way to use an XML document as a data-source for your procedures. OPENXML allows the data in XML document to be treated just like the columns and rows of your database table i.e., xml tags as columns and the value as rows. Data can be inserted / updated very quickly and efficiently without multiple trips to the database.

For example, if 100 records are to inserted / updated, then the traditional SQL method is using 100 insert / update statements. This means that 100 trips are made to the database, which results in degradation of performance. Using XML, these 100 trips can be reduced to 1 trip. This increases the performance of your application.

In this article, we would look at the basic syntax of OpenXML and end with a simple example.

Basic Syntax

OPENXML (idoc int [in],rowpattern nvarchar[in],[flags byte[in]]) [WITH SchemaDeclaration | TableName)]

    Arguments
          Idoc is the document handle of the internal representation of an XML document.
               This handle is obtained by calling the system stored procedure sp_xml_preparedocument, which is 
               discussed later.

         Rowpattern is the XPath query used to identify the nodes to be processed as rows.

        Flags indicates the mapping between the XML data and the relational rowset. (optional parameter)

            0 - Defaults to attribute-centric mapping.
            1 - Attribute-centric mapping. (Combined with XML_ELEMENTS)
            2 - Element-centric mapping. (Combined with XML_ATTRIBUTES)
            8 - Combined with XML_ATTRIBUTES or XML_ELEMENTS

          SchemaDeclaration is the schema definition of the form:

    ColName ColType [ColPattern | MetaProperty][, ColName ColType [ColPattern | MetaProperty]...]

    Eg: WITH (EMPLOYEENAME VARCHAR(30), EMPLOYEESALARY INTEGER)

          TableName is the table name that can be given, instead of Schema Declaration, if a table exists.

The WITH clause provides a table format using either SchemaDeclaration or specifying an existing TableName. If the optional WITH clause is not specified, the results are returned in an edge table format. Edge tables represent the fine-grained XML document structure (e.g. element/attribute names, the document hierarchy, the namespaces, PIs etc.) in a single table.

System Stored Procedures for OpenXML

SQL Server provides system stored procedures that are used in conjunction with OPENXML:

  • sp_xml_preparedocument
  • sp_xml_removedocument

To write queries using OPENXML, you must first create an internal representation of the XML document by calling sp_xml_preparedocument. It is similar to LoadXML() function provided in System.XML namespace. The stored procedure returns a handle to the internal representation of the XML document. This handle is then passed to OPENXML, which provides tabular view of the document based on Xpath query.

The internal representation of an XML document can be removed from memory by calling sp_xml_removedocument system stored procedure.

About sp_xml_preparedocument

Steps which occurs when this procedure is executed:

  1. Reads the XML text provided as input.
  2. Parses the text using the XML parser.
  3. Provides the parsed document, which is in tree form containing various nodes (elements, attributes, text, comments, and so on) in the XML document.
  4. It returns a handle that can be used to access the newly created internal representation of the XML document. This handle is valid for until the connection is reset, or until the execution of sp_xml_removedocument.

Note: A parsed document is stored in the internal cache of SQL Server 2000. The MSXML parser uses one-eighth the total memory available for SQL Server. To avoid running out of memory, run sp_xml_removedocument to free up the memory.

The syntax is:

          sp_xml_preparedocument hdoc OUTPUT [, xmltext] [, xpath_namespaces]

hdoc is the handle to the newly created document.(Integer value)

[xmltext]  is the original XML document. The default value is NULL, in which case an internal representation of an empty XML document is created.

[xpath_namespaces]  Specifies the namespace declarations that are used in row and column XPath expressions in OPENXML.

Returns: 0 (success) or >0 (failure)

 

About sp_xml_removedocument

Removes the internal representation of the XML document specified by the document handle.

The syntax is: sp_xml_removedocument hdoc

hdoc is the handle to the newly created document.(Integer value)

            Returns: 0 (success) or >0 (failure)

Basic structure of stored-procedure(using OPENXML)

CREATE PROCEDURE dbo.TestOpenXML
(
	@strXML VARCHAR(2000)
)
AS

DECLARE @XMLDocPointer INT

EXEC sp_xml_preparedocument @XMLDocPointer OUTPUT, @strXML

BEGIN TRANSACTION
    ------ Any DML can be used here -------(see examples)
COMMIT

EXEC sp_xml_removedocument @XMLDocPointer

RETURN

Examples

Consider two tables: Employees & Departments.
    Employees: EmployeeID, EmployeeName, EmployeeSalary, DeptID.
    Departments: DeptID, DeptName.
    (EmployeeID and DeptID are Identity fileds.)

The XML file will follow the structure.
(XML Type 1)

<RECORD>
    <EMPLOYEE>
        <EMPLOYEENAME>Test Name</EMPLOYEENAME>
        <EMPLOYEESALARY>1000</EMPLOYEESALARY>
        <DEPTID>10</DEPTID>
    </EMPLOYEE>
    .
    .
</RECORD>

If 100 employees have been recruited, u can generate a XML file having the following structure with 100 <EMPLOYEE> tags, the query should be:
INSERT INTO Employees(EmployeeName, EmployeeSalary, DeptID)
SELECT EMPLOYEENAME, EMPLOYEESALARY, DEPTID
FROM OPENXML(@XMLDocPointer,'/RECORD/EMPLOYEE',2)
WITH (EMPLOYEENAME VARCHAR(30), EMPLOYEESALARY INTEGER, DEPTID INTEGER)
If 100 employees are transferred from Production Department to Stores Department, the query should be:
UPDATE Employees
SET DeptID = xmlTable.DEPTID 
FROM OPENXML(@XMLDocPointer,'/RECORD/EMPLOYEE',2)
WITH (EMPLOYEEID INTEGER, DEPTID INTEGER) XmlTable 
WHERE XmlTable.EMPLOYEEID = Employees.EmployeeID
    Note: use of alias for XML document is required to avoid confusion.

If the XML file contains attributes, there is minor change to the syntax.
(XML Type 2)


<RECORD>
<EMPLOYEE DEPTID = ‘10’ NAME = ‘Test’ SALARY = ‘10000’/>
<EMPLOYEE DEPTID = ‘10’ NAME = ‘Test’ SALARY = ‘10000’/>
<EMPLOYEE DEPTID = ‘10’ NAME = ‘Test’ SALARY = ‘10000’/>
.
.
</RECORD>

INSERT INTO Employees(EmployeeName, EmployeeSalary, DeptID)
SELECT NAME, SALARY, DEPTID
FROM OPENXML(@XMLDocPointer,'/RECORD/EMPLOYEE',2)
WITH (NAME VARCHAR(30) ‘@NAME’,
SALARY INT ‘@SALARY’,
DEPTID INT '@DEPTID')
Another scenario for attributes (accessing the parent node)
(XML Type 3)

<RECORD>
    <DEPT ID=’10’>
        <EMPLOYEE NAME = ‘Test101’ SALARY = ‘10000’/>
        <EMPLOYEE NAME = ‘Test102’ SALARY = ‘10000’/>
    </DEPT>
    <DEPT ID=’11’>
        <EMPLOYEE NAME = ‘Test111’ SALARY = ‘10000’/>
        <EMPLOYEE NAME = ‘Test112’ SALARY = ‘10000’/>
        <EMPLOYEE NAME = ‘Test113’ SALARY = ‘10000’/>
        <EMPLOYEE NAME = ‘Test114’ SALARY = ‘10000’/>
    </DEPT>
    .
    .
</RECORD>

INSERT INTO Employees (EmployeeName, EmployeeSalary, DeptID)
SELECT NAME, SALARY, DEPTID 
FROM OPENXML (@XMLDocPointer,'/RECORD/DEPT/EMPLOYEE',2)
WITH (NAME VARCHAR(30) ‘@NAME’, 
SALARY INT ‘@SALARY’, 
DEPTID INT '../@ID')

Finally, your store procedure should look like (using XML Type 2):

CREATE PROCEDURE dbo.TestOpenXML
( @strXML VARCHAR(2000)
)
AS
DECLARE @XMLDocPointer INT
EXEC sp_xml_preparedocument @XMLDocPointer OUTPUT, @strXML

BEGIN TRANSACTION

INSERT INTO Employees(EmployeeName, EmployeeSalary, DeptID)
SELECT NAME, SALARY, DEPTID
FROM OPENXML(@XMLDocPointer,'/RECORD/EMPLOYEE',2)
WITH (NAME VARCHAR(30) ‘@NAME’, SALARY INT ‘@SALARY’, DEPTID INT '@DEPTID')
COMMIT

EXEC sp_xml_removedocument @XMLDocPointer
RETURN

Conclusion

With SQL Server 2000, there are many ways you can get XML representation of relational data. One simple way is using the FOR XML clause with SELECT statement. A reverse mechanism, which allows turning XML document into tabular, relational rowset format is provided by T-SQL keyword named OPENXML.

The steps for using OPENXML can be simplified as:

  1. Call sp_xml_preparedocument. (system stored procedure which loads XML document string into memory, parses it and returns a handle).
  2. Use XPath querries for extracting required information from the XML through the handle.
  3. Call sp_xml_removedocument. (system stored procedure which frees up the memory allocated to the internal representation of the XML document).

In this article, we studied the OPENXML syntax and options available with it, with simpe examples. Future articles we will discuss other XML features offered by SQL Server 2000.

Total article views: 38270 | Views in the last 30 days: 41
 
Related Articles
BLOG

Salaried employee vs contractor

Are you a salaried employee looking to switch to a W-2 or 1099 contractor? (For an explanation of W-...

FORUM

OPENXML

Retreiving data from OPENXML

FORUM
BLOG

Consulting company: Perm/Salaried vs 1099/W2

One of my pet peeves is seeing a top-notch BI architect/developer working as a salaried/perm employe...

FORUM

Update salary details without cursors

Hi i want to write the while loop without cursors and update salary of emp if salary is less than 50...

Tags
openxml    
xml    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones