Using T-SQL to Transform XML Data to a Relational Format

  • Comments posted to this topic are about the item Using T-SQL to Transform XML Data to a Relational Format

  • How about using OPENXML?

    DECLARE @idoc int, @myXml xml

    SET @myXml = '<Author xmlns:xsi="..." xmlns:xsd="..." version="1">

    <Benefits>

    <Benefit Name="HospitalizationLimit" Value="500000"/>

    <Benefit Name="MedicalConsultationLimit" Value="12000"/>

    <Benefit Name="LifeInsuranceCoverage" Value="1000000"/>

    <Benefit Name="MonthlyTelephone" Value ="5000"/>

    <Benefit Name="DependentMemberCount" Value="5"/>

    ...

    </Benefits>

    </Author>'

    EXEC sp_xml_preparedocument @idoc OUTPUT, @myXml

    SELECT *

    FROM OPENXML ( @idoc, '/Author/Benefits', 2 )

    WITH (

    HospitilizationLimit int './Benefit[@Name="HospitalizationLimit"]/@Value',

    MedicalConsultationLimit int './Benefit[@Name="MedicalConsultationLimit"]/@Value'

    )

    EXEC sp_xml_removedocument @idoc

    I find OPENXML to be fairly useful when translating XML to table data.

    Hope this helps!

    -Graham

  • I'm glad you brought up OPENXML Graham. In the past I've used both methods (XQUERY, OPENXML) to process XML data.

    MSDN suggests using OPENXML statements with XML data columns. But I can't see how this is possible and how it could be efficient.

    If I have an XML column with XML indexes created on it would I want to pass column values to a sp_xml_preparedocument stored procedure and then use OPENXML statements?

  • Gary,

    I would think XML indexes are only useful across multiple rows, like when you want to find rows that contain certain data in the XML or when the XML column would appear in the WHERE clause of a query. If you are processing row-by-row, OPENXML would still be a good option.

    Anyone have something else to add?

    -Graham

  • Graham O'Daniel (12/30/2009)


    How about using OPENXML?

    I find OPENXML to be fairly useful when translating XML to table data.

    Hope this helps!

    -Graham

    NO! See: Stop Using OPENXML

    http://www.sqlservercentral.com/blogs/michael_coles/archive/2008/01/20/stop-using-openxml-please.aspx


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • Thanks for your post Mauve, it completely clears up what I suspected.

    I've created a data feed that archives thoasands of XML files from a web-service, stores them in an indexed xml column and then shreds the data into complex relational hierarchies. The whole thing runs very quickly as a service on the production server. I think using the OPENXML technique will drain the resources and make other services on the server suffer.

  • Any idea on how to do the same for a nested xml structure (e.g. a bill-of-material).

  • Using OPENXML vs. .nodes is a trade off situation, each with good use cases.

    You have to start by asking yourself what you want: performance or resource constraints. If you want performance and don't care about resources then OPENXML is your choice. If you are concerned about resource usage on the server then .nodes is your choice.

    I wouldn't suggest one way over the other for all situations.

    Please see:

    http://sql-server-performance.com/Community/forums/p/25212/147146.aspx

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=93759

    http://www.windows-tech.info/15/0a9fd8b0e4b36ccc.php

  • @mike-2: provide me an example of your XML and what you want, and I can write it up for you in both OPENXML and .nodes methods.

    -Graham

  • Below an example of the source xml I mean.

    <Products>

    <Product ID="123" Name="Male Bicycle" Amount="1">

    <Product ID="988" Name="Male Frame" Amount="1"/>

    <Product ID="501" Name="Wheel" Amount="2">

    <Product ID="450" Name="Spoke" Amount="50"/>

    <Product ID="490" Name="Rim" Amount="1"/>

    </Product>

    ...

    </Product>

    <Product ID="234" Name="Female Bicycle" Amount="1">

    <Product ID="998" Name="Female Frame" Amount="1"/>

    <Product ID="501" Name="Wheel" Amount="2">

    <Product ID="450" Name="Spoke" Amount="50"/>

    <Product ID="490" Name="Rim" Amount="1"/>

    </Product>

    ...

    </Product>

    </Products>

    I would like to translate to two tables:

    Table 1 = Products:

    - Product ID

    - Product Name

    This table will only contain the distinct records from the xml. So product "Rim" will only appear once even though it has been defined for 20 bicycles.

    Table 2 = BOM:

    - Parent Product ID

    - Component Product ID

    - Amount

  • Mike, this should do the trick. OPENXML method first followed by the .nodes() method.

    declare @xml xml

    set @xml = '<Products>

    <Product ID="123" Name="Male Bicycle" Amount="1">

    <Product ID="988" Name="Male Frame" Amount="1"/>

    <Product ID="501" Name="Wheel" Amount="2">

    <Product ID="450" Name="Spoke" Amount="50"/>

    <Product ID="490" Name="Rim" Amount="1"/>

    </Product>

    </Product>

    <Product ID="234" Name="Female Bicycle" Amount="1">

    <Product ID="998" Name="Female Frame" Amount="1"/>

    <Product ID="501" Name="Wheel" Amount="2">

    <Product ID="450" Name="Spoke" Amount="50"/>

    <Product ID="490" Name="Rim" Amount="1"/>

    </Product>

    </Product>

    </Products>'

    declare @Products table ( ProductID int, ProductName varchar(50) )

    declare @BOM table ( ParentProductID int, ComponentProductID int, Amount int )

    declare @idoc int

    exec sp_xml_preparedocument @idoc output, @xml

    insert into @Products ( ProductID, ProductName )

    select distinct

    ID

    , Name

    from OPENXML ( @idoc, '//Product', 2 )

    with (

    ID int '@ID'

    , Name varchar(50) '@Name'

    )

    insert into @BOM ( ParentProductID, ComponentProductID, Amount )

    select

    ParentProductID

    , ComponentProductID

    , Amount

    from OPENXML ( @idoc, '//Product', 2 )

    with (

    ParentProductID int '../@ID'

    , ComponentProductID int '@ID'

    , Amount int '@Amount'

    )

    exec sp_xml_removedocument @idoc

    select * from @Products

    select * from @BOM

    delete @Products

    delete @BOM

    insert into @Products ( ProductID, ProductName )

    select distinct

    node.value ( '(@ID)[1]', 'int' )

    , node.value ( '(@Name)[1]', 'varchar(50)' )

    from @xml.nodes ( '//Product' ) p(node)

    insert into @BOM ( ParentProductID, ComponentProductID, Amount )

    select

    node.value ( '(../@ID)[1]', 'int' )

    , node.value ( '(@ID)[1]', 'int' )

    , node.value ( '(@Amount)[1]', 'int' )

    from @xml.nodes ( '//Product' ) p(node)

    select * from @Products

    select * from @BOM

  • Thanks for all of this it was exactly what I was trying to solve. I'm using the XQUERY and have all good results except one particular node format.

    The node looks like:

    <ProductID ProductNo="100086861"></ProductID>

    I'd be happy to get the full string but really only need that number.

    Using the XQUERY approach can that be done?

    Thanks for your time.

    P

  • paul.morgan-1069605 (11/2/2011)


    Thanks for all of this it was exactly what I was trying to solve. I'm using the XQUERY and have all good results except one particular node format.

    The node looks like:

    <ProductID ProductNo="100086861"></ProductID>

    I'd be happy to get the full string but really only need that number.

    Using the XQUERY approach can that be done?

    Thanks for your time.

    P

    (Using SQL Server 2008 R2)

    DECLARE @xml xml = '<ProductID ProductNo="100086861"></ProductID>';

    SELECT @xml.value('/ProductID[1]/@ProductNo', 'int');


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply