SQLServerCentral Article

Importing XML Files Into SQL Server

,

Introduction

So, for whatever reason you are fed up with your marriage to XML and you want

to go back to storing your data in (gasp!) a database. This article explains

one approach for doing just that. In the example, we are using Windows

Scripting Host to iterate through a few thousand XML files that will be

imported into a SQL Server table. There is an added level of complexity in that

the content of the XML files is formatted in HTML.

The Tools

  • SQLXML 3.0, which can be downloaded

    here

  • ActivePerl (optional), which can be downloaded

    here

The Situation

You are in the situation of working with thousands of XML files used to show FAQs on a web site.  The

structure of the XML files are as such:

<?xml version="1.0" encoding="utf-8" ?>

<FAQ ID="020001" LanguageCode="EN" Priority="1">

<Question><!-- How can I solve my problem? --></Question>

<Summary><!-- Run the hardware utility. --></Summary>

<Content>

    <!--

    <div id="contentcenter">

   

<p>Perform a <b>hardware</b> check with the utility to assure

the quality of the system driver, etc., etc.</p>

   

</div> 

    -->

</Content>

</FAQ>

If your content is not stored as HTML or you already

have it enclosed in CDATA tags, then you can skip directly to the section titled "Importing the XML

Files."

In order to use SQLXML to import the data, you need to get

rid of the comment tags.  There are probably ways to create an XML Schema

that will ignore the comment tags, but I chose instead to find and replace the

comment tags with CDATA tags.  CDATA is used to tell the XML

parser to ignore anything between the tags.  For reasons that I

won't get into, I chose to use Perl to handle the text manipulation.

(NOTE: I know many developers would suggest using the REPLACE function,

or .NET XmlTextReader with an

INSERT statement to import to the database.  As I mentioned, this is just

ONE way to import the data.  An example from a guy using

VBScript is

here.)

Open Notepad and save the code below as a file called

Replace.pl.  After installing

ActivePerl, you open the file from a command

prompt and run it.  You will be prompted for what you want to replace. The

find and replace process is twofold:  replacing the beginning comment

(<!--) with the beginning CDATA tag (<![CDATA[) and the end comment

(-->) with the ending CDATA tag (]]>).

#!/usr/bin/perl
#
# NOTE: the find-and-replace IS case-sensitive.
$dir = $ARGV[0] || '.';
$!=1;
chdir($dir) or die $!;
opendir(DIR,$dir) or die $!;
my $changes=0;
print "\n :: find :. ";
$find=<STDIN>; chop($find);
print " :: replace :. ";
$replace=<STDIN>; chop($replace);
$/=undef;
foreach my $file (readdir(DIR))
{
if($file=~/^.+\.xml$/) # adjust pattern to your needs
{
print "parsing file $file..\n";
open(FILE,"<$file") or die $!;
my $file_data=<FILE>;
close(FILE);
$changes += $file_data =~ s/$find/$replace/gme;
open(FILE,">$file") or die $!;
print FILE $file_data;
close(FILE);
}
}
print "Voila. ($changes changes)\n";

The data are now ready to be imported.

Importing the XML Files

You will need to create a table that has the same format as the

XML schema used to import the data.  The XML Schema is below;

the table name is referenced as sql:relation="<tablename>".

The following is saved as a file called FAQschema.xml.

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"

xmlns:sql="urn:schemas-microsoft-com:mapping-schema">

   

<xsd:element name="FAQ" sql:relation="FAQ" >

       

<xsd:complexType>

           

<xsd:sequence>

           

<xsd:element name="Question" type="xsd:string" sql:use-cdata="1" />

           

<xsd:element name="Summary" type="xsd:string" sql:use-cdata="1" />

           

<xsd:element name="Content" type="xsd:string" sql:use-cdata="1" />

           

</xsd:sequence>

           

<xsd:attribute name="ID" type="xsd:string" />

           

<xsd:attribute name="LanguageCode" type="xsd:string" />

       

</xsd:complexType>

   

</xsd:element>

</xsd:schema>

Once the table and the schema are ready, you can use the

SQLXMLBulkLoad object and the File System Object to iterate through each

file and insert the data into the table.  The following is a VBScript

file used to do the import and write success/errors to a file. Save the

file with a .vbs extension and run from a command prompt in the same

directory as the XML files.

set objBL = CreateObject("SQLXMLBulkLoad.SQLXMLBulkload.3.0")
objBL.ConnectionString = "provider=SQLOLEDB;datasource=localhost;database=TEST;integrated security=SSPI"
objBL.ErrorLogFile = "C:\FAQImport\error.log"
' Here is the path to your XML files
Const path = "C:\FAQImport\FAQs\"
Dim Text, Title, oFile
Dim fso, oFolder, oFiles, wsh ' Object variables
Text = "Folder "
Title = "XML Files"
Set wsh = WScript.CreateObject("WScript.Shell")
' Create FileSystemObject object to access the file system.
Set fso = CreateObject("Scripting.FileSystemObject")
' Get Folder object.
Set oFolder = fso.GetFolder(wsh.ExpandEnvironmentStrings(path)) 
' Get All Files
Set oFiles = oFolder.Files
For Each oFile In oFiles
If oFile.Type = "XML Document" Then 
Text = Text & oFolder & vbCrLf & vbCrLf
Text = Text & "Name" & vbTab & vbTab & "Size" & vbCrLf
Text = Text & oFile.Name & vbTab
Text = Text & FormatNumber(oFile.Size, 0) & vbCrLf
objBL.Execute "C:\FAQImport\FAQschema.xml", oFile.Name
End If
Next
Dim IFile
Set IFile = fso.CreateTextFile("C:\FAQImport\results.log", True)
IFile.WriteLine(Text)
IFile.Close
MsgBox "FAQ import successful!", vbOKOnly + vbInformation, Title
Set objBL = Nothing
Set fso = Nothing

Conclusions

You can usually find many examples about how to

consume XML data in SQL Server or inversely how to import/export a single XML file.  This article looks at how to import multiple XML files.

Personally, I couldn't find much on this particular topic so I decided

to write this up. I hope you find it useful!

Download the code

Credits

1) VBScript: The basis of the file/folder iteration

come from a script developed by

Günter Born.

2) XSD and SQLXML examples are derived from the Help files in

the SQLXML download.

3) Perl Script: This is something I got off of

CPAN a few years back and I've found

it to be invaluable.  If you know the source PLEASE contact me so I

can give him or her proper credit.

Rate

3.38 (8)

You rated this post out of 5. Change rating

Share

Share

Rate

3.38 (8)

You rated this post out of 5. Change rating