Returning multiple rows from XML

  • Greetings,

    I am new to XML and am struggling with shredding an XML column into a tabular format.

    Here is what I have so far.

    DECLARE @t TABLE
    (
        XMLCol XML
    )

    INSERT INTO @t VALUES ('<RESPONSE>
      <EXPR>CHF</EXPR>
      <EXCH>USD</EXCH>
      <AMOUNT>1</AMOUNT>
      <NPRICES>1</NPRICES>
      <CONVERSION>
        <DATE>Fri, 01 Jun 2012 22:50:00 GMT</DATE>
        <ASK>1.7956</ASK>
        <BID>1.7946</BID>
      </CONVERSION>
      <EXPR>EUR</EXPR>
      <EXCH>USD</EXCH>
      <AMOUNT>1</AMOUNT>
      <NPRICES>1</NPRICES>
      <CONVERSION>
        <DATE>Sat, 02 Jun 2012 22:50:00 GMT</DATE>
        <ASK>1.1794</ASK>
        <BID>1.1787</BID>
      </CONVERSION>
    </RESPONSE>')

    SELECT x.y.value('(EXPR)[1]', 'VARCHAR(3)') AS 'EXPR'
         , x.y.value('(EXCH)[1]', 'VARCHAR(3)') AS 'EXCH'
         , x.y.value('(/RESPONSE/CONVERSION/DATE)[1]', 'VARCHAR(30)') AS 'DATE'
         , x.y.value('(/RESPONSE/CONVERSION/ASK)[1]', 'FLOAT') AS 'ASK'
         , x.y.value('(/RESPONSE/CONVERSION/BID)[1]', 'FLOAT') AS 'BID'
    FROM @t AS t
    CROSS APPLY t.XMLCol.nodes('/RESPONSE') AS x(y)

    This returns the format I need, but only the first row. How do you use value() and return multiple rows? Or what should I be using instead of value()?

    Here is what I need the results to look like.

    EXPR  EXCH  DATE                           ASK     BID
    CHF   USD   Fri, 01 Jun 2012 22:50:00 GMT  1.7956  1.7946
    EUR   USD   Sat, 02 Jun 2012 22:50:00 GMT  1.1794  1.1787

    Thanks,

    Tim

  • Hi,

    Normally you can simply use the nodes() method to shred the XML, and it will return a nice dataset. However, in your example XML the repeating data sits at the same level, with no element grouping each set, which makes this a bit trickier. It is still possible, though. Basically I have come up with the following:

    DECLARE @t XML

    SET @t = '<RESPONSE>
      <EXPR>CHF</EXPR>
      <EXCH>USD</EXCH>
      <AMOUNT>1</AMOUNT>
      <NPRICES>1</NPRICES>
      <CONVERSION>
        <DATE>Fri, 01 Jun 2012 22:50:00 GMT</DATE>
        <ASK>1.7956</ASK>
        <BID>1.7946</BID>
      </CONVERSION>
      <EXPR>EUR</EXPR>
      <EXCH>USD</EXCH>
      <AMOUNT>1</AMOUNT>
      <NPRICES>1</NPRICES>
      <CONVERSION>
        <DATE>Sat, 02 Jun 2012 22:50:00 GMT</DATE>
        <ASK>1.1794</ASK>
        <BID>1.1787</BID>
      </CONVERSION>
    </RESPONSE>'

    SELECT x.y.value('(for $x in (//EXPR)
                       return if ($x << . and $x) then $x else ()
                      )[last()]', 'VARCHAR(3)') AS 'EXPR'
         , x.y.value('(for $x in (//EXCH)
                       return if ($x << . and $x) then $x else ()
                      )[last()]', 'VARCHAR(3)') AS 'EXCH'
         , x.y.value('(DATE/text())[1]', 'VARCHAR(30)') AS 'DATE'
         , x.y.value('(ASK/text())[1]', 'FLOAT') AS 'ASK'
         , x.y.value('(BID/text())[1]', 'FLOAT') AS 'BID'
    FROM @t.nodes('/RESPONSE') T(c)
    CROSS APPLY T.c.nodes('CONVERSION') AS x(y)

    The code uses the nodes() method to shred the XML and then, via the CROSS APPLY operator, effectively creates a record for each <CONVERSION> node.

    Values under <CONVERSION> are easy to retrieve, as the query shows. To get the other values, however, I had to use a FLWOR statement to make sure I retrieved the nodes immediately above the <CONVERSION> node in scope. Each FLWOR statement builds a sequence of the //EXPR (or //EXCH) nodes positioned before the current <CONVERSION> node in the document, which is achieved with the << node-order comparison. From that sequence it then returns the last() item, which is the node nearest to the <CONVERSION> node.
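    As a minimal sketch of that << node-order comparison in isolation (the element names and data below are made up, not from the thread; assumes SQL Server 2008+ for the inline DECLARE):

    ```sql
    -- Hypothetical fragment: for each <B>, find the nearest preceding <A>.
    DECLARE @x XML = '<root><A>1</A><B/><A>2</A><B/></root>'

    SELECT b.n.value('(for $a in //A
                       return if ($a << .) then $a else ()
                      )[last()]', 'INT') AS NearestPrecedingA
    FROM @x.nodes('/root/B') AS b(n)
    -- Row 1: 1 (only the first <A> precedes the first <B>)
    -- Row 2: 2 (the second <A> is the last one before the second <B>)
    ```

    Inside value(), the context node "." is the current <B>, so "$a << ." keeps only the <A> nodes that occur earlier in document order.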

    Hope this helps.

    That works perfectly! Thank you very much!

    Tim

  • T_Peters (10/30/2012)


    That works perfectly! Thank you very much!

    Tim

    No problem, thanks for the feedback.

    I did notice that I left something unnecessary in the code from my testing :ermm:

    You don't need the "and $x" in the FLWOR statements. So it should actually look like this:

    if ($x << .) then $x else ()
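    Plugged back into the earlier pattern, a trimmed-down sketch of the corrected query (a smaller, made-up XML fragment with EXPR and ASK only, for brevity):

    ```sql
    DECLARE @x XML = '<RESPONSE><EXPR>CHF</EXPR><CONVERSION><ASK>1.7956</ASK></CONVERSION><EXPR>EUR</EXPR><CONVERSION><ASK>1.1794</ASK></CONVERSION></RESPONSE>'

    -- For each <CONVERSION>, pick up the nearest preceding <EXPR>.
    SELECT x.y.value('(for $x in (//EXPR)
                       return if ($x << .) then $x else ()
                      )[last()]', 'VARCHAR(3)') AS EXPR
         , x.y.value('(ASK/text())[1]', 'FLOAT') AS ASK
    FROM @x.nodes('/RESPONSE/CONVERSION') AS x(y)
    -- EXPR  ASK
    -- CHF   1.7956
    -- EUR   1.1794
    ```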

  • Great, thank you again for your help.

    Tim

  • Nice job, Arthur.

    No reflection on your fine code. Just an observation about XML. I'm absolutely amazed at how expensive this type of XML processing actually is. The underlying table valued function does...

    2 Full Cartesian Products (Square Joins)

    2 Triangular Joins

    2 Double Joins (2x # of rows)

    6 Single Joins (1x # of rows)

    The CPU time it took to process 16 "sets" (1 set per result row) of such data on my admittedly 10-year-old desktop was 32ms. Now, even if that performance were linear (and it won't be), it would take 8.8 hours to import a lousy million rows, where something like BCP importing a million-row delimited file containing the same data takes well less than a minute.

    Then, there's the impact on the pipe and the IO system. A million "rows" of this type of XML data (sans any unnecessary spaces and all end of line characters) is about 168 million characters. A delimited file with the same information AND 2 character end of line markers is only about 42 million characters or about one fourth as much IO.

    I know I sound like a Luddite, but I'm just not seeing any advantage of using "flat" XML to transmit and import "flat" data.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ah, yes... and to add insult to injury, if the dates and times were actually in a standard format, the delimited file would only contain 28 million bytes... that's only 1/6th the IO and pipe load of the XML.

    That's just insane to me and I don't understand the draw that people have towards XML for such "flat" data transmission.

    --Jeff Moden



  • Jeff Moden (10/31/2012)


    Nice job, Arthur.

    No reflection on your fine code. Just an observation about XML. I'm absolutely amazed at how expensive this type of XML processing actually is.

    Hi Jeff.

    Thanks for your comments. I totally agree with you regarding how expensive these types of queries are. For sure, the query I posted could be fine-tuned, but it will still be much, much more expensive than using a flat file, for example. If XML data is as flat as it is in this example, then XML isn't the best option for carrying that data.

    It does seem that XML is the default choice for moving data around these days, probably driven in part by middle-tier/app developers rather than DBA/developers, since you can serialize huge chunks of data to an XML file in about five lines of code (or fewer!). Give me flat files to import any day!

    Thanks guys. Unfortunately, as is often the case, I have to process the data in the format I have, not the format I might want or wish to have at a later time. I wouldn't choose XML, but that isn't up to me. On the bright side, it's daily exchange rate data, and unless there is an explosion in new currencies, I don't think I'll be processing upwards of a million records.

    Tim

    arthurolcot (11/1/2012)


    Thanks for your comments. I totally agree with you regarding how expensive these types of queries are.

    Nice to see a kindred spirit on the subject. Thanks for the feedback, Arthur.

    --Jeff Moden



    Are there any books that explain syntax such as $x or //EXPR?

    grovelli-262555 (11/2/2012)


    Are there any books that explain syntax such as $x or //EXPR?

    Hi grovelli,

    $x is a variable within a FLWOR statement. You can begin to learn about FLWOR statements here: http://msdn.microsoft.com/en-us/library/ms190945.aspx

    //EXPR is a path expression. Again, a good starting point is BOL: http://msdn.microsoft.com/en-us/library/ms189919.aspx

    As for books on the XML data type itself, there are a couple of good SQL Server books pretty much dedicated to it and to working with it, e.g. XQuery, XML DML, etc. I don't have a particular one to recommend, as it depends on exactly what you want to learn, but the ones from Wrox or Apress are probably good starting points if you want to learn more about SQL Server XML.
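    As a tiny, made-up illustration of both pieces of syntax ($x as the FLWOR range variable, //i as a path expression):

    ```sql
    DECLARE @x XML = '<items><i>10</i><i>20</i><i>30</i></items>'

    -- "for $x in //i" binds $x to each <i> element in document order;
    -- //i is a path expression matching <i> anywhere in the document.
    SELECT @x.query('for $x in //i
                     where $x > 10
                     return <big>{data($x)}</big>')
    -- Returns: <big>20</big><big>30</big>
    ```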
