|
|
|
Right there with Babe
      
Group: General Forum Members
Last Login: Today @ 5:29 PM
Points: 728,
Visits: 1,088
|
|
Greetings,
I am new to XML and am struggling with shredding an XML column into a tabular format.
Here is what I have so far.
DECLARE @t Table ( XMLCol XML )
INSERT INTO @t Values('<RESPONSE> <EXPR>CHF</EXPR> <EXCH>USD</EXCH> <AMOUNT>1</AMOUNT> <NPRICES>1</NPRICES> <CONVERSION> <DATE>Fri, 01 Jun 2012 22:50:00 GMT</DATE> <ASK>1.7956</ASK> <BID>1.7946</BID> </CONVERSION> <EXPR>EUR</EXPR> <EXCH>USD</EXCH> <AMOUNT>1</AMOUNT> <NPRICES>1</NPRICES> <CONVERSION> <DATE>Sat, 02 Jun 2012 22:50:00 GMT</DATE> <ASK>1.1794</ASK> <BID>1.1787</BID> </CONVERSION> </RESPONSE>')
SELECT x.y.value('(EXPR)[1]','VARCHAR(3)') AS 'EXPR' , x.y.value('(EXCH)[1]','VARCHAR(3)') AS 'EXCH' , x.y.value('(/RESPONSE/CONVERSION/DATE)[1]','VARCHAR(30)') AS 'DATE' , x.y.value('(/RESPONSE/CONVERSION/ASK)[1]','FLOAT') AS 'ASK' , x.y.value('(/RESPONSE/CONVERSION/BID)[1]','FLOAT') AS 'BID' FROM @t as t CROSS APPLY t.XMLCol.nodes('/RESPONSE') AS x(y)
This will return the format I need, but only for the first row. How do you use value and return multiple rows? Or what should I be using instead of value
Here is what I need the results to look like.
EXPR EXCH DATE ASK BID CHF USD Fri, 01 Jun 2012 22:50:00 GMT 1.7956 1.7946 EUR USD Fri, 01 Jun 2012 22:50:00 GMT 1.1794 1.1787
Thanks, Tim
|
|
|
|
|
SSCommitted
      
Group: General Forum Members
Last Login: Monday, May 20, 2013 9:04 AM
Points: 1,722,
Visits: 1,404
|
|
Hi, Normally you can simply use the nodes() method to shred the xml and this will return you a nice dataset. However, with your example XML, I see that the data seems to be at the same level and that there is no easy grouping. This has made it a bit trickier, but it is still possible.. Basically I have come up with the following:
DECLARE @t XML SET @t = '<RESPONSE> <EXPR>CHF</EXPR> <EXCH>USD</EXCH> <AMOUNT>1</AMOUNT> <NPRICES>1</NPRICES> <CONVERSION> <DATE>Fri, 01 Jun 2012 22:50:00 GMT</DATE> <ASK>1.7956</ASK> <BID>1.7946</BID> </CONVERSION> <EXPR>EUR</EXPR> <EXCH>USD</EXCH> <AMOUNT>1</AMOUNT> <NPRICES>1</NPRICES> <CONVERSION> <DATE>Sat, 02 Jun 2012 22:50:00 GMT</DATE> <ASK>1.1794</ASK> <BID>1.1787</BID> </CONVERSION> </RESPONSE>'
SELECT x.y.value('(for $x in (//EXPR) return if ($x << . and $x) then $x else () )[last()]','VARCHAR(3)') AS 'EXPR' , x.y.value('(for $x in (//EXCH) return if ($x << . and $x) then $x else () )[last()]','VARCHAR(3)') AS 'EXCH' , x.y.value('(DATE/text())[1]','VARCHAR(30)') AS 'DATE' , x.y.value('(ASK/text())[1]','FLOAT') AS 'ASK' , x.y.value('(BID/text())[1]','FLOAT') AS 'BID' FROM @t.nodes('/RESPONSE') T(c) CROSS APPLY T.c.nodes('CONVERSION') AS x(y)
What the code is doing is it is using the nodes() method to get the XML and then using the CROSS APPLY operator, it is essentially creating a record for each <CONVERSION> node.
For nodes under <CONVERSION> it is really easy to get those values as shown in the query. However, to get the other values, I had to use a flwor statement to make sure that i retrieved the nodes immediately above the <CONVERSION> section that is in scope. What the flwor statements are doing is that they are getting a sequence of //EXPR nodes (or //EXCH nodes) that are positioned before the <CONVERSION> node in the XML. That is achieved by using the << comparison. Then with that sequence of nodes, it returns the last() one of that sequence. The last one would be the one that is nearest to the <CONVERSION> node.
Hope this helps.
|
|
|
|
|
Right there with Babe
      
Group: General Forum Members
Last Login: Today @ 5:29 PM
Points: 728,
Visits: 1,088
|
|
That works perfect! Thank you very much!
Tim
|
|
|
|
|
SSCommitted
      
Group: General Forum Members
Last Login: Monday, May 20, 2013 9:04 AM
Points: 1,722,
Visits: 1,404
|
|
T_Peters (10/30/2012) That works perfect! Thank you very much!
Tim
No problem, thanks for the feedback.
I did notice that i left something unnecessary in the code from my testing 
You don't need the "and $x" in the flwor statements. So it should actually look like this:
if ($x << .) then $x else ()
|
|
|
|
|
Right there with Babe
      
Group: General Forum Members
Last Login: Today @ 5:29 PM
Points: 728,
Visits: 1,088
|
|
Great, thank you again for your help.
Tim
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 7:51 PM
Points: 32,910,
Visits: 26,800
|
|
Nice job, Arthur.
No reflection on your fine code. Just an observation about XML. I'm absolutely amazed at how expensive this type of XML processing actually is. The underlying table valued function does...
2 Full Cartesian Products (Square Joins) 2 Triangular Joins 2 Double Joins (2x # of rows) 6 Single Joins (1x # of rows)
The CPU time it took to process 16 "sets" (1 set per result row) of such data on my admittedly 10 year old desktop was 32ms. Now, even if that performance was linear (and it won't be), it would take 8.8 hours to import a lousy million rows where something like BCP importing a million row delimited file containing the same data takes well less than a minute.
Then, there's the impact on the pipe and the IO system. A million "rows" of this type of XML data (sans any unnecessary spaces and all end of line characters) is about 168 million characters. A delimited file with the same information AND 2 character end of line markers is only about 42 million characters or about one fourth as much IO.
I know I sound like a Luddite, but I'm just not seeing any advantage of using "flat" XML to transmit and import "flat" data.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 7:51 PM
Points: 32,910,
Visits: 26,800
|
|
Ah, yes... and to add insult to injury, if the dates and times were actually in a standard format, the delimited file would only contain 28 million bytes... that's only 1/6th the IO and pipe load of the XML.
That's just insane to me and I don't understand the draw that people have towards XML for such "flat" data transmission.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|
|
SSCommitted
      
Group: General Forum Members
Last Login: Monday, May 20, 2013 9:04 AM
Points: 1,722,
Visits: 1,404
|
|
Jeff Moden (10/31/2012) Nice job, Arthur.
No reflection on your fine code. Just an observation about XML. I'm absolutely amazed at how expensive this type of XML processing actually is.
Hi Jeff. Thanks for your comments. I totally agree with you regarding how expensive these types of queries are. For sure, the query I posted could be fine tuned, but it will still be much much more expensive then using a flat file for example. If any XML data is as flat as it is in the example, then XML isn't the best option to carry that data.
It does seem that to move data around these days, XML seems to be the default choice. Probably in part being driven by middle tier/app developers rather than DBA/Developers as you can serialize huge chunks of data to an xml file in about 5 lines of code (or less!). Give me flat files to import any day!
|
|
|
|
|
Right there with Babe
      
Group: General Forum Members
Last Login: Today @ 5:29 PM
Points: 728,
Visits: 1,088
|
|
Thanks guys. Unfortunately as is often the case I have to process the data with the format I have, not the format I might want or wish to have at a later time. I wouldn't choose XML, but that isn't up to me. Also on the bright side, it's daily exchange rate data and unless there is an explosion in new currencies, I don't think I'll be processing upwards of a million records.
Tim
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Today @ 7:51 PM
Points: 32,910,
Visits: 26,800
|
|
arthurolcot (11/1/2012)
Jeff Moden (10/31/2012) Nice job, Arthur.
No reflection on your fine code. Just an observation about XML. I'm absolutely amazed at how expensive this type of XML processing actually is.Hi Jeff. Thanks for your comments. I totally agree with you regarding how expensive these types of queries are. For sure, the query I posted could be fine tuned, but it will still be much much more expensive then using a flat file for example. If any XML data is as flat as it is in the example, then XML isn't the best option to carry that data. It does seem that to move data around these days, XML seems to be the default choice. Probably in part being driven by middle tier/app developers rather than DBA/Developers as you can serialize huge chunks of data to an xml file in about 5 lines of code (or less!). Give me flat files to import any day!
Nice to see a kindred spirit on the subject. Thanks for the feedback, Arthur.
--Jeff Moden "RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."
For better, quicker answers on T-SQL questions, click on the following... http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following... http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
|
|
|
|