Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Importing XML Files Into SQL Server

By Steve Moore, (first published: 2006/02/15)

Introduction

So, for whatever reason you are fed up with your marriage to XML and you want to go back to storing your data in (gasp!) a database. This article explains one approach for doing just that. In the example, we are using Windows Scripting Host to iterate through a few thousand XML files that will be imported into a SQL Server table. There is an added level of complexity in that the content of the XML files is formatted in HTML.

The Tools

  • SQLXML 3.0, which can be downloaded here
  • ActivePerl (optional), which can be downloaded here

The Situation

You are in the situation of working with thousands of XML files used to show FAQs on a web site.  The structure of the XML files are as such:

<?xml version="1.0" encoding="utf-8" ?>
<FAQ ID="020001" LanguageCode="EN" Priority="1">
<Question><!-- How can I solve my problem? --></Question>
<Summary><!-- Run the hardware utility. --></Summary>
<Content>
    <!--
    <div id="contentcenter">
    <p>Perform a <b>hardware</b> check with the utility to assure the quality of the system driver, etc., etc.</p>
    </div> 
    -->
</Content>
</FAQ>

If your content is not stored as HTML or you already have it enclosed in CDATA tags, then you can skip directly to the section titled "Importing the XML Files."

In order to use SQLXML to import the data, you need to get rid of the comment tags.  There are probably ways to create an XML Schema that will ignore the comment tags, but I chose instead to find and replace the comment tags with CDATA tags.  CDATA is used to tell the XML parser to ignore anything between the tags.  For reasons that I won't get into, I chose to use Perl to handle the text manipulation.

(NOTE: I know many developers would suggest using the REPLACE function, or .NET XmlTextReader with an INSERT statement to import to the database.  As I mentioned, this is just ONE way to import the data.  An example from a guy using VBScript is here.)

Open Notepad and save the code below as a file called Replace.pl.  After installing ActivePerl, you open the file from a command prompt and run it.  You will be prompted for what you want to replace. The find and replace process is twofold:  replacing the beginning comment (<!--) with the beginning CDATA tag (<![CDATA[) and the end comment (-->) with the ending CDATA tag (]]>).

#!/usr/bin/perl
#
# NOTE: the find-and-replace IS case-sensitive.

$dir = $ARGV[0] || '.';
$!=1;
chdir($dir) or die $!;
opendir(DIR,$dir) or die $!;
my $changes=0;
print "\n :: find :. ";
$find=<STDIN>; chop($find);
print " :: replace :. ";
$replace=<STDIN>; chop($replace);
$/=undef;
foreach my $file (readdir(DIR))
{
if($file=~/^.+\.xml$/) # adjust pattern to your needs
{
print "parsing file $file..\n";
open(FILE,"<$file") or die $!;
my $file_data=<FILE>;
close(FILE);
$changes += $file_data =~ s/$find/$replace/gme;
open(FILE,">$file") or die $!;
print FILE $file_data;
close(FILE);
}
}
print "Voila. ($changes changes)\n";
		

The data are now ready to be imported.

Importing the XML Files

You will need to create a table that has the same format as the XML schema used to import the data.  The XML Schema is below; the table name is referenced as sql:relation="<tablename>".

The following is saved as a file called FAQschema.xml.

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
    <xsd:element name="FAQ" sql:relation="FAQ" >
        <xsd:complexType>
            <xsd:sequence>
            <xsd:element name="Question" type="xsd:string" sql:use-cdata="1" />
            <xsd:element name="Summary" type="xsd:string" sql:use-cdata="1" />
            <xsd:element name="Content" type="xsd:string" sql:use-cdata="1" />
            </xsd:sequence>
            <xsd:attribute name="ID" type="xsd:string" />
            <xsd:attribute name="LanguageCode" type="xsd:string" />
        </xsd:complexType>
    </xsd:element>
</xsd:schema>

Once the table and the schema are ready, you can use the SQLXMLBulkLoad object and the File System Object to iterate through each file and insert the data into the table.  The following is a VBScript file used to do the import and write success/errors to a file. Save the file with a .vbs extension and run from a command prompt in the same directory as the XML files.

set objBL = CreateObject("SQLXMLBulkLoad.SQLXMLBulkload.3.0")

objBL.ConnectionString = "provider=SQLOLEDB;datasource=localhost;database=TEST;integrated security=SSPI"

objBL.ErrorLogFile = "C:\FAQImport\error.log"

' Here is the path to your XML files

Const path = "C:\FAQImport\FAQs\"

Dim Text, Title, oFile
Dim fso, oFolder, oFiles, wsh ' Object variables

Text = "Folder "
Title = "XML Files"

Set wsh = WScript.CreateObject("WScript.Shell")
' Create FileSystemObject object to access the file system.
Set fso = CreateObject("Scripting.FileSystemObject")

' Get Folder object.
Set oFolder = fso.GetFolder(wsh.ExpandEnvironmentStrings(path)) 
' Get All Files
Set oFiles = oFolder.Files

For Each oFile In oFiles
	If oFile.Type = "XML Document" Then 
		Text = Text & oFolder & vbCrLf & vbCrLf
		Text = Text & "Name" & vbTab & vbTab & "Size" & vbCrLf
		Text = Text & oFile.Name & vbTab
		Text = Text & FormatNumber(oFile.Size, 0) & vbCrLf

		objBL.Execute "C:\FAQImport\FAQschema.xml", oFile.Name
	End If
Next

Dim IFile

Set IFile = fso.CreateTextFile("C:\FAQImport\results.log", True)
IFile.WriteLine(Text)
IFile.Close

MsgBox "FAQ import successful!", vbOKOnly + vbInformation, Title

Set objBL = Nothing
Set fso = Nothing
			

Conclusions

You can usually find many examples about how to consume XML data in SQL Server or inversely how to import/export a single XML file.  This article looks at how to import multiple XML files. Personally, I couldn't find much on this particular topic so I decided to write this up. I hope you find it useful!

Download the code

Credits

1) VBScript: The basis of the file/folder iteration come from a script developed by Günter Born.

2) XSD and SQLXML examples are derived from the Help files in the SQLXML download.

3) Perl Script: This is something I got off of CPAN a few years back and I've found it to be invaluable.  If you know the source PLEASE contact me so I can give him or her proper credit.

Total article views: 37481 | Views in the last 30 days: 66
 
Related Articles
FORUM

Replacing Schema with new schema of Database objects

Replace "Emp" schema with "Employee" Schema

FORUM

importing Oracle8i schema into mssql server 2000

importing Oracle8i schema into mssql server 2000

FORUM

Retreive Symbols in CDATA section TSQL

Retreive and Preserve CDATA section of XML document

FORUM

Should XML Parser apply XSD Schema restrictions to CDATA?

Hi, I'm importing some XML files into SQL which have an XSD schema which I am validating against....

FORUM

XML Destination - CDATA

Trying to create CDATA using T-SQL.

Tags
miscellaneous    
xml    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones