Creating a format file to import XML data

  • Hello gurus and geniuses

    Please forgive me for starting yet another thread on this subject. I have searched the forum and read some interesting articles, but none of them describes or helps with my actual task, which is...

    ...I have to import an XML data file into an existing SQL database. The file has been provided to me by an external source and I have no control over its creation. I am completely new to the concept of using XML-formatted data, so I've been hastily reading up on the methods used to do this.

    As an example of the file I'm working with, it looks something like this:

    <?xml version="1.0" encoding="utf-8"?>
    <ArrayOfOpenOrders xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <OpenOrders>
        <ExtensionData />
        <COD_SALESORDER>0000812383</COD_SALESORDER>
        <DATE_SALESORDER>2013-02-27T17:11:42+00:00</DATE_SALESORDER>
        <REQDATE_SALESORDER>2013-03-05</REQDATE_SALESORDER>
        <COD_DOCTYPE>ZOR </COD_DOCTYPE>
        <DSC_DOCTYPE>Standard Order</DSC_DOCTYPE>
        <OrderDeliveryStatus_Code>C</OrderDeliveryStatus_Code>
        .
        .
        .
        <BILL_DOC_ITEM_REFNUMBER />
      </OpenOrders>

    There are 36 field elements in each record. The file can contain up to around 15,000 records, so it can be huge. I want to import that data into a table, with each field value mapped to its respective column in the table.

    I am now happy in principle with the idea of using OPENROWSET to retrieve the data from the XML file. As a test, I tried writing the following T-SQL script to view the contents:

    declare @d xml

    select @d = d
    from openrowset (bulk 'D:\temp\Testing\Test.xml', single_blob) as OpenOrders(d)

    select @d

    When I ran it, I got no results. By process of elimination and reading another thread I found on the forum, I think the problem is related to the size of the file. If I cut the majority of records from the file, and retain no more than 26, I get a result. Is this related to the maximum size of varbinary(max), which is 8000 bytes or so?

    From the same thread I read that to import data that's over this limit, I would need to use a format file instead of specifying SINGLE_BLOB. This is where I'm struggling!

    I have read up on the MSDN website about creating a format file, and I understand that you can create either an XML or a non-XML format file. However, although there are examples on the site, they don't show what the source files or the end results look like, so I'm finding it difficult to envisage what I need to do.

    Q1. - is there a direct correlation between the data file you're trying to import and the type of format file you need to create? In other words, if I have a non-XML data file do I need to create a non-XML format file, and an XML format file for XML data? Or can you create either type for either data file?

    Q2. - how do I create a format file to describe the format of the XML data?

    On the msdn website I'm looking at this article:

    Example E shows how to map XML data to a table, but I'm confused by the use of an xml column in the destination SQL table. Does this mean that the entire record in the XML data file would be mapped to one xml column?

    I'd be grateful for any further explanations!

    Julian

  • Decomposing an XML document into relational tables, rows and columns is commonly called "shredding", and you can find various articles on this, although I have found some of them to be a bit confusing myself.

    If you have the XML in a variable, as you indicated, then one method that worked for me is described in the vocabulary example in this article, although that was tested in SQL Server 2008 R2, not 2005. The example covers splitting the data into two related tables, which may be a more typical use case than a single table.
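
    To make the shredding idea concrete, here is a minimal sketch using the xml type's nodes() and value() methods. It is not the method from the linked article, just one common way to do it. It assumes a cut-down document holding only a few of the 36 elements from the sample in the question, and it reads every value as a string; the column lengths are guesses, and converting the date strings to proper date types is left as a later step.

    ```sql
    -- Illustrative only: a cut-down copy of the sample from the question.
    -- SET is used rather than DECLARE ... = so the script is also valid on SQL Server 2005.
    DECLARE @d xml;
    SET @d = '
    <ArrayOfOpenOrders xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                       xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <OpenOrders>
        <ExtensionData />
        <COD_SALESORDER>0000812383</COD_SALESORDER>
        <DATE_SALESORDER>2013-02-27T17:11:42+00:00</DATE_SALESORDER>
        <REQDATE_SALESORDER>2013-03-05</REQDATE_SALESORDER>
        <DSC_DOCTYPE>Standard Order</DSC_DOCTYPE>
        <BILL_DOC_ITEM_REFNUMBER />
      </OpenOrders>
    </ArrayOfOpenOrders>';

    -- Optional sanity check: how many bytes the xml variable actually holds.
    SELECT DATALENGTH(@d) AS xml_bytes;

    -- nodes() yields one row per <OpenOrders> element;
    -- value() pulls a single scalar out of each of those rows.
    -- An empty element such as <BILL_DOC_ITEM_REFNUMBER /> has no text node, so it comes back as NULL.
    SELECT
        o.r.value('(COD_SALESORDER/text())[1]',          'varchar(10)') AS COD_SALESORDER,
        o.r.value('(DATE_SALESORDER/text())[1]',         'varchar(30)') AS DATE_SALESORDER,
        o.r.value('(REQDATE_SALESORDER/text())[1]',      'varchar(10)') AS REQDATE_SALESORDER,
        o.r.value('(DSC_DOCTYPE/text())[1]',             'varchar(50)') AS DSC_DOCTYPE,
        o.r.value('(BILL_DOC_ITEM_REFNUMBER/text())[1]', 'varchar(10)') AS BILL_DOC_ITEM_REFNUMBER
    FROM @d.nodes('/ArrayOfOpenOrders/OpenOrders') AS o(r);
    ```

    The same SELECT should work unchanged when @d is filled from OPENROWSET (BULK ..., SINGLE_BLOB) as in the question, and putting an INSERT INTO in front of it (against a destination table of your own, not shown here) would load the rows. As far as I know, varbinary(max) and xml values can hold up to about 2 GB rather than 8000 bytes, so the DATALENGTH check may help tell a load problem from a display problem.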
