Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Parsing huge XML file into a database table using sql server Expand / Collapse
Author
Message
Posted Friday, May 31, 2013 1:13 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, June 03, 2013 10:21 AM
Points: 7, Visits: 23
I tried to parse a huge xml file using sql server (OPENROWSET) into a database table, but got an error that says duplicate attributes (author). The author column varies for each article, ranging from one author to upto six authors. The columns I am really interested in are author, title and journal. Below is a snippet of the xml data. Any suggestions will be highly appreciated. Thanks in advance.

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE dblp SYSTEM "dblp.dtd">
<dblp>
<article mdate="2002-01-03" key="persons/CoddD74">
<author>E. F. Codd</author>
<author>C. J. Date</author>
<title>Interactive Support for Non-Programmers: The Relational and Network Approaches.</title>
<journal>IBM Research Report, San Jose, California</journal>
<volume>RJ1400</volume>
<month>June</month>
<year>1974</year>
</article>
</dblp>
Post #1458538
Posted Friday, May 31, 2013 2:54 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: 2 days ago @ 8:36 AM
Points: 2,419, Visits: 1,565
Hi... Have you tried something like below.. This works fine with your example xml:

DECLARE @xml XML
SELECT @xml = CONVERT(XML, (SELECT * FROM OPENROWSET(BULK N'c:\testxml.xml', SINGLE_BLOB) AS xmldocument), 2)

SELECT t.c.value('(./text())[1]', 'varchar(30)')
FROM @xml.nodes('//author') AS T(c)


Post #1458560
Posted Friday, May 31, 2013 6:02 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, June 03, 2013 10:21 AM
Points: 7, Visits: 23
Thanks for the response, it worked, but I would like the output to be in the formatt below. All the authors of a particular article should be together in a column, separate by comma, and the title as well as the journal should occupy the next two columns. The authors of a particular article, title and journal should be just one record. Thanks in advance for your suggestions.

Author (s)
E F Cod, Patric A V Hall, Markus Tresch
Post #1458627
Posted Friday, May 31, 2013 7:13 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: 2 days ago @ 8:36 AM
Points: 2,419, Visits: 1,565
I wrote a blog on something similar a while back: http://www.olcot.co.uk/sql-blogs/using-xquery-to-create-a-single-xml-node-with-comma-separated-string-from-multiple-xml-nodes

here is one of a few ways to comma separate the node author for you:

DECLARE @xml XML
SELECT @xml = CONVERT(XML, (SELECT * FROM OPENROWSET(BULK N'c:\testxml.xml', SINGLE_BLOB) AS xmldocument), 2)

SELECT t.c.query('for $x in author
return
if (not( ($x) is (author[last()])[1] )) then
concat($x/text()[1], ",")
else
string($x/text()[1])').value('.', 'varchar(200)')
, t.c.value('(title/text())[1]', 'varchar(150)') AS 'title'
, t.c.value('(journal/text())[1]', 'varchar(100)') AS 'journal'
FROM @xml.nodes('//article') AS T(c)

Post #1458659
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse