Large XML into SQL Server

  • Hey, all.. Thought I'd post this issue here, as I haven't found a suitable answer anywhere else that exactly addresses this problem.

    First, the problem..  I have a large (72 meg, 819,000 records) XML file in hierarchical format that I need to bulk load into a table (maybe 2 tables, because of the hierarchy).

    Second, my constraints..  We are using BEA WebLogic in which to develop.  That's Java.  MS doesn't like Java, so we're kinda euchred, there.  Ok, fine.. I wrote a VBS script to call the bulkloader and everything worked fine.. But the architect team told us that VBS has been disabled on the live servers, so using VBS is not an option.  Ok, seeing as I'm stuck in Java and can't use VBS, that sort of leaves out the bulkloader, doesn't it..

    Third, the solution.  That's what you guys hopefully come in.  Do I need to use bulk insert or bcp in a stored procedure?  I can find samples of how to do this with a csv file or something similar, but examples using xml are scarcer than brains in Ottawa.

    Fourth, for those inclined to investigate the issue further, I am including my schema and a sample xml file below.  This is the schema I used for the bulk loader with VBS, and it worked. It put stuff into 2 tables, an Item table and a BusinessUnit table for the meta info.

    Thank you in advance for any help/examples/advice you can give.  Of course, everyone wants this solution done yesterday.

    <?xml version="1.0" encoding="UTF-8"?>

    <xs:schema xmlns="stsi/amextan" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:sql="urn:schemas-microsoft-com:mapping-schema" targetNamespace="stsi/amextan" elementFormDefault="qualified">

     <xs:annotation>

      <xs:appinfo>

       <sql:relationship name="BU_Items" parent="INT001_BusinessUnit" parent-key="TAN_PREFIX" child="INT001_Item" child-key="TAN_PREFIX"/>

      </xs:appinfo>

     </xs:annotation>

     <xs:element name="AMEXTANToACNTAN" sql:is-constant="true">

      <xs:complexType>

       <xs:sequence>

        <xs:element name="BusinessUnit" sql:relation="INT001_BusinessUnit">

         <xs:complexType>

          <xs:sequence>

           <xs:element name="ITEM" sql:relation="INT001_Item" sql:relationship="BU_Items">

            <xs:complexType>

             <xs:sequence>

              <xs:element name="ChannelType" type="xs:string" sql:field="CHANNEL_TYPE"/>

              <xs:element name="TANStatus" type="xs:string" sql:field="STATUS"/>

              <xs:element name="TIN" type="xs:string" minOccurs="0" sql:field="TIN"/>

             </xs:sequence>

             <xs:attribute name="TAN" use="required" sql:field="TAN_NUMBER"/>

            </xs:complexType>

           </xs:element>

          </xs:sequence>

          <xs:attribute name="Code" type="xs:string" use="required" sql:field="TAN_PREFIX"/>

          <xs:attribute name="TANCount" type="xs:long" use="required" sql:field="TAN_COUNT"/>

         </xs:complexType>

        </xs:element>

       </xs:sequence>

       <xs:attribute name="DateTimeCreated" type="xs:string" use="required" sql:mapped="false"/>

       <xs:attribute name="RecordCount" type="xs:long" use="required" sql:mapped="false"/>

       <xs:attribute name="SerialNum" type="xs:long" use="required" sql:mapped="false"/>

      </xs:complexType>

     </xs:element>

    </xs:schema>

     

    <?xml version="1.0"?>

    <AMEXTANToACNTAN DateTimeCreated="20040127165815" RecordCount="3" xmlns="stsi/amextan" SerialNum="31">

        <BusinessUnit Code="AAAA" TANCount="2">

            <ITEM TAN="AAAA00324">

                <ChannelType>P</ChannelType>

                <TANStatus>A</TANStatus>

                <TIN></TIN>

            </ITEM>

            <ITEM TAN="AAAA98422">

                <ChannelType>R</ChannelType>

                <TANStatus>U</TANStatus>

                <TIN>1234567</TIN>

            </ITEM>

        </BusinessUnit>

        <BusinessUnit Code="BBBB" TANCount="1">

            <ITEM TAN="BBBB00420">

                <ChannelType>P</ChannelType>

                <TANStatus>C</TANStatus>

                <TIN>       </TIN>

            </ITEM>

        </BusinessUnit>

    </AMEXTANToACNTAN>

  • If you can't use Java to go straight from XML into SQL Server, can you use Java to convert XML into CSV and then use BCP or bulk insert to import the CSV into SQL?  Better yet, you could probably write an XSL transform to convert XML into CSV.  Just my initial thoughts...

  • That's actually a possibility.. It's not very elegant, of course.. But we can curse XML for that.

    We may very well go that route if the powers that be determine our current route isn't kosher.

    What we hope to do is use Microsoft's SQLXMLBulkLoader, wrapped in a C++ class which is compiled into an executable, to accomplish this.  I've tested this, and it works. 

Viewing 3 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply