Importing Strange XML File

Question

Importing Strange XML File

Viewing 3 posts - 16 through 18 (of 18 total)

You must be logged in to reply to this topic. Login to reply

SQLRNNR SSC Guru Points: 281344 More actions · Answer 1

Mark Derryberry (3/1/2011)
Over 50% from my initial attempts! I was pushing about 6 hours and the methods you all proposed and some additional things I had to do given the scope. It's down to about 2 1/2 on a server. I say server because my desktop dev machine almost choked to death. 🙂

Excellent.

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

LutzM SSC Guru Points: 107049 More actions · Answer 2

Mark Derryberry (3/1/2011)
Over 50% from my initial attempts! I was pushing about 6 hours and the methods you all proposed and some additional things I had to do given the scope. It's down to about 2 1/2 on a server. I say server because my desktop dev machine almost choked to death. 🙂

I didn't know about the size of the file and I didn't ask for it either...

Given this scenario I would try a slightly different approach:

Step 1: load the xml file into a staging table with an xml column using BULK IMPORT

Step 2: add an xml index

Step 3: shred the xml data into another staging table (based on the "cte" part of the code provided so far)

Step 4: index the shredded staging table

Step 5: split the shredded data based on the "split" part of the code, most probably into yet another staging table including adding an index

Step 6: final select using the CrossTab method instead of Pivot (since it has been shown that Pivot is slower than the CrossTab - but I don't have the reference available right now)

An alternative route instead of using steps 1, 2 and 3 could be using OPENXML:

DECLARE @idoc int

DECLARE @doc varchar(max)

SET @doc =YourLargeXMLFile

--Create an internal representation of the XML document.

EXEC sp_xml_preparedocument @idoc OUTPUT, @doc

-- Execute a SELECT statement that uses the OPENXML rowset provider.

SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 1) ) AS row,*

FROM OPENXML (@idoc, '/wcsxml/relation/row',1)

WITH (col varchar(200) '../@columns',

val varchar(200) '@values')

I would recommend to compare the performance of each method individually (not testing the whole process at once) to see which one would perform better.

Since both will return the identical result you could decide which one to use.

Based on the given scenario I'd guess the OPENXML approach might be faster...

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

Mark Derryberry SSC Enthusiast Points: 124 More actions · Answer 3

Thanks for the great suggestions.

I hate it when I get so wrapped around the axel I overlook some other avenues. I'm already doing some of that when I begin importing the data so I'll just expand on the idea.