May 31, 2013 at 1:13 am
I tried to parse a huge xml file using sql server (OPENROWSET) into a database table, but got an error that says duplicate attributes (author). The author column varies for each article, ranging from one author to upto six authors. The columns I am really interested in are author, title and journal. Below is a snippet of the xml data. Any suggestions will be highly appreciated. Thanks in advance.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE dblp SYSTEM "dblp.dtd">
<dblp>
<article mdate="2002-01-03" key="persons/CoddD74">
<author>E. F. Codd</author>
<author>C. J. Date</author>
<title>Interactive Support for Non-Programmers: The Relational and Network Approaches.</title>
<journal>IBM Research Report, San Jose, California</journal>
<volume>RJ1400</volume>
<month>June</month>
<year>1974</year>
</article>
</dblp>
May 31, 2013 at 2:54 am
Hi... Have you tried something like below.. This works fine with your example xml:
DECLARE @xml XML
SELECT @xml = CONVERT(XML, (SELECT * FROM OPENROWSET(BULK N'c:\testxml.xml', SINGLE_BLOB) AS xmldocument), 2)
SELECT t.c.value('(./text())[1]', 'varchar(30)')
FROM @xml.nodes('//author') AS T(c)
May 31, 2013 at 6:02 am
Thanks for the response, it worked, but I would like the output to be in the formatt below. All the authors of a particular article should be together in a column, separate by comma, and the title as well as the journal should occupy the next two columns. The authors of a particular article, title and journal should be just one record. Thanks in advance for your suggestions.
Author (s)
E F Cod, Patric A V Hall, Markus Tresch
May 31, 2013 at 7:13 am
I wrote a blog on something similar a while back: http://www.olcot.co.uk/sql-blogs/using-xquery-to-create-a-single-xml-node-with-comma-separated-string-from-multiple-xml-nodes
here is one of a few ways to comma separate the node author for you:
DECLARE @xml XML
SELECT @xml = CONVERT(XML, (SELECT * FROM OPENROWSET(BULK N'c:\testxml.xml', SINGLE_BLOB) AS xmldocument), 2)
SELECT t.c.query('for $x in author
return
if (not( ($x) is (author[last()])[1] )) then
concat($x/text()[1], ",")
else
string($x/text()[1])').value('.', 'varchar(200)')
, t.c.value('(title/text())[1]', 'varchar(150)') AS 'title'
, t.c.value('(journal/text())[1]', 'varchar(100)') AS 'journal'
FROM @xml.nodes('//article') AS T(c)
Viewing 4 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy