November 25, 2014 at 7:38 pm
Hi,
I've just been given an xml file to import to SQL. I haven't looked at xml files before, but from reading some previous posts I thought I'd try it via a SSIS XML Source task. The problem I'm having is when I try and 'Generate XSD' not all the information I require is being returned, i.e. all of these elements are not showing within the 'Fields' table (full xml and xsd files attached).
- <Fields>
<Field Name="InvoiceNumber" Type="invoicenumber">00072328</Field>
<Field Name="InvoiceDate" Type="invoicedate">20141119</Field>
<Field Name="SubTotal" Type="invoicetotalvatexcludedamount">4140.00</Field>
<Field Name="GSTTotal" Type="invoicetotalvatamount">414.00</Field>
<Field Name="InvoiceTotal" Type="invoicetotalvatincludedamount">4554.00</Field>
<Field Name="PONumber" Type="invoiceordernumber">974250</Field>
<Field Name="ABNNumber" Type="suppliertaxnumber1">1111111111</Field>
<Field Name="InvoiceJobNumber" Type="invoicejobnumber" />
<Field Name="ProgressClaimNumber" Type="progressclaimnumber" />
<Field Name="ProgressClaimDate" Type="progressclaimdate" />
<Field Name="JobNumber" Type="jobnumber" />
</Fields>
It seems to be because the lines start 'Field Name' rather than <name of field>, but I'm not sure where to begin.
Any help would be greatly appreciated!
November 26, 2014 at 12:05 am
There are few options, use XSLT to transform the structure, change the XSD using something like xsd:all and the Type/Name attributes or simply parse the XML with XQuery in SQL Server, the last would probably be my preference, quick sample below.
DECLARE @TXML XML = '<Documents xmlns:xsi="http://XMLSchema-instance" xsi:noNamespaceSchemaLocation="ReadsoftInvoice.xsd">
<Document Version="1.0" Name="XML Invoice" Type="Invoice" DocumentReference="" ImageReference="" CertificateReference="">
<Invoice>
<Supplier>
<Name>PURCHASE COMPANY PTY LTD</Name>
<Identifier>5263</Identifier>
<Number>BSF01</Number>
<Description>CG</Description>
<CorporateGroupNumber>1</CorporateGroupNumber>
<POBox />
<Street>369 STREET QLD 4133</Street>
<StreetSupplement />
<PostalCode />
<City />
<CountryCode>AUS</CountryCode>
<CountryName />
<TelephoneNumber>07 123 4567</TelephoneNumber>
<FaxNumber />
<VATRegistrationNumber>11111111111</VATRegistrationNumber>
</Supplier>
<Buyer>
<Name>Companyname</Name>
<Identifier>1</Identifier>
<Number>1</Number>
<Description />
<CorporateGroupNumber>1</CorporateGroupNumber>
<POBox />
<Street />
<StreetSupplement />
<PostalCode />
<City />
<CountryCode>AUS</CountryCode>
<CountryName />
<TelephoneNumber />
<FaxNumber />
<VATRegistrationNumber />
</Buyer>
<Fields>
<Field Name="InvoiceNumber" Type="invoicenumber">00072328</Field>
<Field Name="InvoiceDate" Type="invoicedate">20141119</Field>
<Field Name="SubTotal" Type="invoicetotalvatexcludedamount">4140.00</Field>
<Field Name="GSTTotal" Type="invoicetotalvatamount">414.00</Field>
<Field Name="InvoiceTotal" Type="invoicetotalvatincludedamount">4554.00</Field>
<Field Name="PONumber" Type="invoiceordernumber">974250</Field>
<Field Name="ABNNumber" Type="suppliertaxnumber1">1111111111</Field>
<Field Name="InvoiceJobNumber" Type="invoicejobnumber" />
<Field Name="ProgressClaimNumber" Type="progressclaimnumber" />
<Field Name="ProgressClaimDate" Type="progressclaimdate" />
<Field Name="JobNumber" Type="jobnumber" />
</Fields>
<Tables>
<Table Type="LineItem" />
<Table Type="BankAccountTable" />
<Table Type="VATTable" />
</Tables>
</Invoice>
<System>
<Field Type="Pages">\\\BO954044.TIF.pdf \\\Images\BO954044.TIF.pdf</Field>
<Field Type="Appendices" />
<Field Type="BatchDateTime">20141124 09:54:34</Field>
<Field Type="BatchPrefix" />
<Field Type="BatchNumber" />
<Field Type="BatchIndex">0</Field>
<Field Type="DateTime">20141124 09:54:34</Field>
<Field Type="Currency" />
<Field Type="CreditInvoice">false</Field>
<Field Type="Description" />
<Field Type="Originator" />
<Field Type="Title" />
<Field Type="Identification" />
<Field Type="ImageFile1">\\Images\BO954044.TIF.pdf</Field>
<Field Type="ImageFile2" />
<Field Type="NumberOfPages">2</Field>
<Field Type="NumberOfPagesFrontAndBack">2</Field>
<Field Type="NumberOfInvoicePages">2</Field>
<Field Type="NumberOfInvoicePagesFrontAndBack">2</Field>
<Field Type="NumberOfAppendixPages">0</Field>
<Field Type="NumberOfAppendixPagesFrontAndBack">0</Field>
<Field Type="ProfileName">AP_INVOICE</Field>
<Field Type="EndorserNumber" />
<Field Type="Status">Approved</Field>
<Field Type="Supplier">PURCASHE COMPANY PTY LTD</Field>
<Field Type="SupplierDescription">CG</Field>
<Field Type="UserName">test.user@company</Field>
<Field Type="UserRemark" />
<Field Type="GUID">{26B0ECDD-7C5C-4755-807A-7367A47F2C7D}</Field>
</System>
<ProcessLog>
<ProcessMessage>
<TimeStamp>2014-11-24 09:54:36</TimeStamp>
<Type>System</Type>
<Module>Scan</Module>
<Action>Scanned</Action>
<Owner>ap.scanning@company</Owner>
<sMessage />
</ProcessMessage>
<ProcessMessage>
<TimeStamp>2014-11-24 09:56:28</TimeStamp>
<Type>System</Type>
<Module>Interpret</Module>
<Action>Interpreted</Action>
<Owner>administrator</Owner>
<sMessage>Identified as definition = 3773, PURCASHE COMPANY PTY LTD (BSF01). Identifier(s) found: Image</sMessage>
</ProcessMessage>
<ProcessMessage>
<TimeStamp>2014-11-24 10:50:52</TimeStamp>
<Type>System</Type>
<Module>Verify</Module>
<Action>Approved</Action>
<Owner>test.user@hutch</Owner>
<sMessage />
</ProcessMessage>
</ProcessLog>
</Document>
</Documents>';
SELECT
document.DATA.value('@Version' ,'VARCHAR(50)') AS [Version]
,document.DATA.value('@Name' ,'VARCHAR(50)') AS [Name]
,document.DATA.value('@Type' ,'VARCHAR(50)') AS [Type]
,document.DATA.value('@DocumentReference' ,'VARCHAR(50)') AS DocumentReference
,document.DATA.value('@ImageReference' ,'VARCHAR(50)') AS ImageReference
,document.DATA.value('@CertificateReference' ,'VARCHAR(50)') AS CertificateReference
,INVOICE_FIELDS.DATA.value('(Field[ @Name = "InvoiceNumber" ]) [1]' ,'VARCHAR(50)') AS InvoiceNumber
,INVOICE_FIELDS.DATA.value('(Field[ @Name = "InvoiceDate" ]) [1]' ,'VARCHAR(50)') AS InvoiceDate
,INVOICE_FIELDS.DATA.value('(Field[ @Name = "SubTotal" ]) [1]' ,'VARCHAR(50)') AS SubTotal
,INVOICE_FIELDS.DATA.value('(Field[ @Name = "GSTTotal" ]) [1]' ,'VARCHAR(50)') AS GSTTotal
,INVOICE_FIELDS.DATA.value('(Field[ @Name = "InvoiceTotal" ]) [1]' ,'VARCHAR(50)') AS InvoiceTotal
,INVOICE_FIELDS.DATA.value('(Field[ @Name = "PONumber" ]) [1]' ,'VARCHAR(50)') AS PONumber
,INVOICE_FIELDS.DATA.value('(Field[ @Name = "ABNNumber" ]) [1]' ,'VARCHAR(50)') AS ABNNumber
,INVOICE_FIELDS.DATA.value('(Field[ @Name = "InvoiceJobNumber" ]) [1]' ,'VARCHAR(50)') AS InvoiceJobNumber
,INVOICE_FIELDS.DATA.value('(Field[ @Name = "ProgressClaimNumber" ]) [1]','VARCHAR(50)') AS ProgressClaimNumber
,INVOICE_FIELDS.DATA.value('(Field[ @Name = "ProgressClaimDate" ]) [1]' ,'VARCHAR(50)') AS ProgressClaimDate
,INVOICE_FIELDS.DATA.value('(Field[ @Name = "JobNumber" ]) [1]' ,'VARCHAR(50)') AS JobNumber
FROM @TXML.nodes('Documents/Document') AS DOCUMENT(DATA)
OUTER APPLY document.DATA.nodes('Invoice/Fields') AS INVOICE_FIELDS(DATA);
Results
Version Name Type DocumentReference ImageReference CertificateReference InvoiceNumber InvoiceDate SubTotal GSTTotal InvoiceTotal PONumber ABNNumber InvoiceJobNumber ProgressClaimNumber ProgressClaimDate JobNumber
-------- ------------ -------- ------------------ --------------- --------------------- -------------- ------------ --------- --------- ------------- --------- ----------- ----------------- -------------------- ------------------ ----------
1.0 XML Invoice Invoice 00072328 20141119 4140.00 414.00 4554.00 974250 1111111111
November 26, 2014 at 3:07 pm
That's awesome. Exactly what I was after.
Thanks so much,
Craig
Viewing 3 posts - 1 through 3 (of 3 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy