Shred XML column using SSIS

  • arun1_m1

    SSCommitted

    Points: 1949

    Comments posted to this topic are about the item Shred XML column using SSIS

  • Samuel Vella

    SSCrazy Eights

    Points: 8055

    Hi Arun

    Nice walk through on how to use SSIS for shredding XML sources, I usually prefer to use SQL (Example below) but as it's not always possible to use ones preferred solution its good to have alternatives

    CREATE TABLE [dbo].[TestXML](

    [id] [int] IDENTITY(1,1) NOT NULL,

    [computed_column] [xml] NULL

    ) ON [PRIMARY]

    GO

    INSERT INTO [dbo].[TestXML]

    ([computed_column])

    VALUES

    (' <x:books xmlns:x="urn:books">

    <book id="743567">

    <author>Arun Mishra</author>

    <title>The First Article to SSC</title>

    <genre>Fiction</genre>

    <price>100.00</price>

    <pub_date>2013-01-28</pub_date>

    <review>Interesting book</review>

    </book>

    </x:books>')

    SELECT S.ID as TableID,

    ID = book.value('(@id)','int'),

    author = book.value('(author)[1]', 'varchar(100)'),

    title = book.value('(title)[1]', 'varchar(100)'),

    genre = book.value('(genre)[1]', 'varchar(100)'),

    price = book.value('(price)[1]', 'money'),

    pub_date = book.value('(pub_date)[1]', 'datetime'),

    review = book.value('(review)[1]', 'varchar(1000)')

    FROM

    TestXML S

    CROSS APPLY computed_column.nodes('/*:books/book') AS Tbl(book)

    INSERT INTO [dbo].[TestXML]

    ([computed_column])

    VALUES

    (' <x:books xmlns:x="urn:books">

    <book id="743568">

    <author>Sam Vella</author>

    <title>A reply to the article</title>

    <genre>Specialist</genre>

    <price>0.99</price>

    <pub_date>2013-01-28</pub_date>

    <review>A good lunch hour read</review>

    </book>

    <book id="743569">

    <author>Philip K Dick</author>

    <title>Do Androids Dream of Electric Sheep</title>

    <genre>Fiction</genre>

    <price>5.99</price>

    <pub_date>1968-01-01</pub_date>

    <review>An even better read</review>

    </book>

    </x:books>')

    SELECT S.ID as TableID,

    ID = book.value('(@id)','int'),

    author = book.value('(author)[1]', 'varchar(100)'),

    title = book.value('(title)[1]', 'varchar(100)'),

    genre = book.value('(genre)[1]', 'varchar(100)'),

    price = book.value('(price)[1]', 'money'),

    pub_date = book.value('(pub_date)[1]', 'datetime'),

    review = book.value('(review)[1]', 'varchar(1000)')

    FROM

    TestXML S

    CROSS APPLY computed_column.nodes('/*:books/book') AS Tbl(book)

  • phleduc

    Grasshopper

    Points: 20

    I am not able to get this to work, setting XML data from variable in XML source editor, variable name vXMLdata, either using "inline schema" or try to generate XSD, getting error "data at root level is invalid, line 1, position 1, is it possible to make a complete package available so I can try to figure it out?

  • JAMESC2003

    SSC Enthusiast

    Points: 138

    I would also like to download the package as I am a newbie

    in the SSIS world. Thanks, James C

  • fregatepallada

    SSC-Addicted

    Points: 407

    I would personally use XQuery - a first class citizen in T-SQL

    IMHO loading XML documents is quite memory-hungry.

    Cheers

  • JAMESC2003

    SSC Enthusiast

    Points: 138

    What would the XQuery option buy you as far as process a series of

    flat files ?

  • fregatepallada

    SSC-Addicted

    Points: 407

    JAMESC2003 (2/27/2013)


    What would the XQuery option buy you as far as process a series of

    flat files ?

    AFTER you pass XML into Stored Procedure. Plus XML native data type in SQL server allowed you to have a STRONGLY TYPED XML parameter, bound by XML schema.

  • itis4junk

    SSC Veteran

    Points: 202

    I'm using SSDT 2010 that "Data Conversion" task in Data Flow keeps changing all input columns to "Unicode text stream [DT_NTEXT]" datatype; therefore, the package fails to populate data in destination table.

    Thanks,

    Khanh

  • SathyanarrayananS

    Grasshopper

    Points: 21

    Hi ,

    Can you explain ,more on what you have done inside DataFlow task.

  • japa62

    SSC Rookie

    Points: 40

    If none of the above benefits of XQuery were involved (and doing the best NOT to log on to work for an investigation), what are the performance differences between the SSIS method and XQuery?

  • Christine Parsons

    Grasshopper

    Points: 20

    Yes, what goes on in 'Convert the XML Data'?

  • Clayton Hoyt

    SSC Journeyman

    Points: 82

    Agreed. This won't work for me for the same reason as a couple of the above.

  • Bunter58

    SSC Enthusiast

    Points: 116

    Agree with others struggling with the XML Source step in the data flow. I think that for the sake of us new to XML, it would have been nice to explain how you set-up the XML step. Perhaps once that's working, it is obvious what to do in the Data Conversion step, too, but a screen shot would make this a lot easier to follow.

    I've tried pasting the XML into the xmlddata variable so I could generate an XSD from that but it rejected that as having too many name spaces. I could only see one namespace so that's obviously not the way it was done.

  • kwasid

    Valued Member

    Points: 52

    What SSIS transformation is being called in the first step of the data flow step ?

    Is it a call to the XML Source, as this does not work, get undeclared XML errors ?

    Can you please give me an answer.

    Thanks,

    Kwasi

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply