Extracting data from a column with xml data

  • How do you extract the data from the message column in the following table. The column [Message] has xml data. The table only has two rows and three columns. In particular, I am interested in getting the Invoice number, TransactionID, referenceNumber (and maybe comment)

    STARTTIMESTEPNUMBERMessage

    23:08.03<?xml version="1.0"?> <Payment MessageLabel="GP_PayInvoiceToAR"> <SubID>0</SubID> <CCType>Visa</CCType> <CCLast4Digits> </CCLast4Digits> <CCExpirationDate>12/2024</CCExpirationDate> <TransactionID>2152966835</TransactionID> <PaymentAmount>2.06</PaymentAmount> <PaymentDate>Aug 20 2012</PaymentDate> <ReferenceNumber>23152</ReferenceNumber> <Comment>INV00000149243</Comment> <Invoices> <Invoice> <InvoiceNumber>INV00000149243</InvoiceNumber> <InvoiceAmount>2.08</InvoiceAmount> <ContractNumber> </ContractNumber> <PaidThruDate>Jan 01 1900</PaidThruDate> </Invoice> </Invoices> </Payment>

    23:08.04<?xml version="1.0"?> <Payment MessageLabel="GP_PayInvoiceToAR"> <SubID>0</SubID> <CCType>Visa</CCType> <CCLast4Digits> </CCLast4Digits> <CCExpirationDate>12/2024</CCExpirationDate> <TransactionID>2152966836</TransactionID> <PaymentAmount>2.06</PaymentAmount> <PaymentDate>Aug 20 2012</PaymentDate> <ReferenceNumber>23153</ReferenceNumber> <Comment>INV00000149244</Comment> <Invoices> <Invoice> <InvoiceNumber>INV00000149244</InvoiceNumber> <InvoiceAmount>2.08</InvoiceAmount> <ContractNumber> </ContractNumber> <PaidThruDate>Jan 01 1900</PaidThruDate> </Invoice> </Invoices> </Payment>

    Thanks

  • Something like this:

    IF OBJECT_ID(N'tempdb..#T') IS NOT NULL

    DROP TABLE #T;

    CREATE TABLE #T

    (ID INT IDENTITY

    PRIMARY KEY,

    XMLCol XML);

    INSERT INTO #T

    (XMLCol)

    VALUES ('<?xml version="1.0"?>

    <Payment MessageLabel="GP_PayInvoiceToAR">

    <SubID>0</SubID>

    <CCType>Visa</CCType>

    <CCLast4Digits />

    <CCExpirationDate>12/2024</CCExpirationDate>

    <TransactionID>2152966835</TransactionID>

    <PaymentAmount>2.06</PaymentAmount>

    <PaymentDate>Aug 20 2012</PaymentDate>

    <ReferenceNumber>23152</ReferenceNumber>

    <Comment>INV00000149243</Comment>

    <Invoices>

    <Invoice>

    <InvoiceNumber>INV00000149243</InvoiceNumber>

    <InvoiceAmount>2.08</InvoiceAmount>

    <ContractNumber />

    <PaidThruDate>Jan 01 1900</PaidThruDate>

    </Invoice>

    </Invoices>

    </Payment>');

    SELECT XMLCol.value('(/Payment/TransactionID/text())[1]', 'varchar(100)') AS TransactionID,

    XMLCol.value('(/Payment/ReferenceNumber/text())[1]', 'varchar(100)') AS ReferenceNumber,

    Payment.Invoice.query('.').value('(/Invoice/InvoiceNumber/text())[1]', 'varchar(100)') AS InvoiceNumber

    FROM #T AS T

    CROSS APPLY XMLCol.nodes('Payment/Invoices/Invoice') AS Payment (Invoice);

    What you need to look up is SQL Server XQuery, specifically the nodes() and value() functions. Beyond that, it's W3C XML standards, if you haven't gone through those (trust me, it's painful).

    (Edit for readability and layout on the XML.)

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Mr. GSquared,

    Thanks for the prompt response. However, I expected 2 rows in the result set. I dont understand where the other row disappeared to. Just to make it clear, the source of the xml data would be a column in another table/query. So the number of rows in that column would vary, and I need all the data in the rows to be used. I hope I make sense!!

  • GSquared...I figure it out. It was all in your solution. THANK YOU A MILLION TIMES!!!

  • You're welcome.

    On the two rows (as you figured out), I only copied one into my test harness. You'll have two (or more) when you use real data.

    However, you should be aware that the XML structure implies the possibility of more than one invoice per entry. It has "Invoices" as a tag with "Invoice" as a sub-set of it. Thus, you could end up with even more rows than you expect, if one or more of the entries have two or more invoices.

    For example:

    IF OBJECT_ID(N'tempdb..#T') IS NOT NULL

    DROP TABLE #T;

    CREATE TABLE #T

    (ID INT IDENTITY

    PRIMARY KEY,

    XMLCol XML);

    INSERT INTO #T

    (XMLCol)

    VALUES ('<?xml version="1.0"?>

    <Payment MessageLabel="GP_PayInvoiceToAR">

    <SubID>0</SubID>

    <CCType>Visa</CCType>

    <CCLast4Digits />

    <CCExpirationDate>12/2024</CCExpirationDate>

    <TransactionID>2152966835</TransactionID>

    <PaymentAmount>2.06</PaymentAmount>

    <PaymentDate>Aug 20 2012</PaymentDate>

    <ReferenceNumber>23152</ReferenceNumber>

    <Comment>INV00000149243</Comment>

    <Invoices>

    <Invoice>

    <InvoiceNumber>INV00000149243</InvoiceNumber>

    <InvoiceAmount>2.08</InvoiceAmount>

    <ContractNumber />

    <PaidThruDate>Jan 01 1900</PaidThruDate>

    </Invoice>

    <Invoice>

    <InvoiceNumber>INV00000149244</InvoiceNumber>

    <InvoiceAmount>2.08</InvoiceAmount>

    <ContractNumber />

    <PaidThruDate>Jan 01 1900</PaidThruDate>

    </Invoice>

    </Invoices>

    </Payment>');

    SELECT XMLCol.value('(/Payment/TransactionID/text())[1]', 'varchar(100)') AS TransactionID,

    XMLCol.value('(/Payment/ReferenceNumber/text())[1]', 'varchar(100)') AS ReferenceNumber,

    Payment.Invoice.query('.').value('(/Invoice/InvoiceNumber/text())[1]', 'varchar(100)') AS InvoiceNumber

    FROM #T AS T

    CROSS APPLY XMLCol.nodes('Payment/Invoices/Invoice') AS Payment (Invoice);

    All I did was add another invoice (I added 1 to the invoice number for it), and you'll now get two rows from the query, one for each invoice. The TransactionID and ReferenceNumber columns will have the same data in both rows, but you'll get both invoice numbers (1 per row).

    Make sure that either the data doesn't contain anything like that, and never will, or that your code can deal with that correctly.

    Make sense?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply