XML Query

  • I have XML being passed into a stored procedure in the following format.  I'm struggling with how to shred it.

    declare @xml xml = convert(xml,N'<SearchQuery>
        <DealTypeDesc>Deal</DealTypeDesc>
        <VendorNum>1</VendorNum>
        <VendorName>Vendor1</VendorName>
        <VendorNum>2</VendorNum>
        <VendorName>Vendor2</VendorName>
        <VendorNum>3</VendorNum>
        <VendorName>Vendor3</VendorName>
        <VendorNum>4</VendorNum>
        <VendorName>Vendor4</VendorName>
        <VendorNum>5</VendorNum>
        <VendorName>Vendor5</VendorName>
    </SearchQuery>')

    -- this is how it is being consumed now.  1 element at a time
    SELECT t.c.value('text()[1]', 'NVARCHAR(MAX)') AS LookupValue
                FROM @xml.nodes('//SearchQuery/VendorNum') AS t(c)

    -- I want the results to look like this where I can use the node name as a column and the value as a lookup.  
    -- I wanted to do this in one pass without having to union all and select from the XML variable every time.
    -- I can't change the XML structure since it's coming from a third party, so that option is out.
    SELECT 'VendorName' ColumnName, t.c.value('text()[1]', 'NVARCHAR(MAX)') AS LookupValue
    FROM @xml.nodes('//SearchQuery/VendorName') AS t(c)
    union all
    SELECT 'VendorNum' ColumnName, t.c.value('text()[1]', 'NVARCHAR(MAX)') AS LookupValue
    FROM @xml.nodes('//SearchQuery/VendorNum') AS t(c)

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • That's a horrible XML schema.  Different nodes related simply by their physical order?  They can't provide Name and Number as attributes of a single node?

    I don't know if there is any guarantee about the order of rows returned by an XPATH function, but this might work:

    SELECT    nbr.VendorNumber, nam.VendorName
    FROM (
        SELECT    rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
                VendorNumber = n.value('.', 'int')
        FROM @xml.nodes('/SearchQuery/VendorNum') x(n)
    ) nbr
    INNER JOIN (
        SELECT    rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
                VendorName = n.value('.', 'varchar(100)')
        FROM @xml.nodes('/SearchQuery/VendorName') x(n)
    ) nam
    ON nam.rn = nbr.rn

    The SQL language will only let you deal with sets.  If you read the XML with a C# program using XMLStreamReader, you could read the nodes one at a time in one pass and put the VendorName and VendorNumber values together.

  • Thanks Scott, but that's not what I was looking for.  I was essentially looking for Key Value pairs and using the node as the column name.  So it would look like below.  I'm not sure who designed the XML, but I agree with you

    ColumnName    LookupValue
    VendorName    Vendor1
    VendorName    Vendor2
    VendorName    Vendor3
    VendorName    Vendor4
    VendorName    Vendor5
    VendorNum    1
    VendorNum    2
    VendorNum    3
    VendorNum    4
    VendorNum    5

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Note that row order is not preserved.


    SELECT    ColumnName = n.value('local-name(.)', 'varchar(100)'),
            LookupValue = n.value('.', 'varchar(100)')

    FROM @xml.nodes('/SearchQuery/*') x(n)
    ORDER BY ColumnName

    ColumnName  LookupValue
    -------------- ----------------------------------------------------------------------------------------------------
    DealTypeDesc Deal
    VendorName  Vendor1
    VendorName  Vendor2
    VendorName  Vendor3
    VendorName  Vendor4
    VendorName  Vendor5
    VendorNum  5
    VendorNum  4
    VendorNum  3
    VendorNum  2
    VendorNum  1

  • If you actually want the name/number combos to retain some sort of sequencing you have to resort to some fairly horrible code to work around the limitations of XPath in SQL Server:


    Select
     Position, NodeName, NodeValue
    From
    (
     Select
      V.value('local-name(.)', 'varchar(100)') As NodeName,
      V.value('.', 'varchar(100)') As NodeValue,
      V.value('for $i in . return count(../*[. << $i]) + 1', 'int') As Position
     From @xml.nodes('/SearchQuery/*') N(V)
    ) X
    Order By Position

  • andycadley - Wednesday, August 29, 2018 12:01 PM

    If you actually want the name/number combos to retain some sort of sequencing you have to resort to some fairly horrible code to work around the limitations of XPath in SQL Server:


    Select
     Position, NodeName, NodeValue
    From
    (
     Select
      V.value('local-name(.)', 'varchar(100)') As NodeName,
      V.value('.', 'varchar(100)') As NodeValue,
      V.value('for $i in . return count(../*[. << $i]) + 1', 'int') As Position
     From @xml.nodes('/SearchQuery/*') N(V)
    ) X
    Order By Position

    Expanding on Andy's good code ...
    declare @xml xml = convert(xml,N'<SearchQuery>
      <DealTypeDesc>Deal</DealTypeDesc>
      <VendorNum>1</VendorNum>
      <VendorName>Vendor1</VendorName>
      <VendorNum>2</VendorNum>
      <VendorName>Vendor2</VendorName>
      <VendorNum>3</VendorNum>
      <VendorName>Vendor3</VendorName>
      <VendorNum>4</VendorNum>
      <VendorName>Vendor4</VendorName>
      <VendorNum>5</VendorNum>
      <VendorName>Vendor5</VendorName>
      <DealTypeDesc>Deal2</DealTypeDesc>
      <VendorNum>13</VendorNum>
      <VendorName>Vendor13</VendorName>
      <VendorNum>11</VendorNum>
      <VendorName>Vendor11</VendorName>
      <VendorNum>12</VendorNum>
      <VendorName>Vendor12</VendorName>
    </SearchQuery>');

    WITH cteShred AS (
      SELECT
       NodeName = N.V.value('local-name(.)', 'varchar(100)')
      , NodeValue = N.V.value('(./text())[1]', 'varchar(100)')
      , Position = N.V.value('for $i in . return count(../*[. << $i]) + 1', 'int')
      FROM @xml.nodes('/SearchQuery/*') N(V)
    )
    , cteBase AS (
      SELECT
       Position = CASE WHEN cs.NodeName = 'VendorNum' THEN cs.Position +1 ELSE cs.Position END
      , cs.NodeName
      , cs.NodeValue
      FROM cteShred As cs
    )
    SELECT
       cb.Position
      , DealtypeDesc = MAX(CASE WHEN cb.NodeName = 'DealTypeDesc' THEN cb.NodeValue END)
      , VendorNum  = MAX(CASE WHEN cb.NodeName = 'VendorNum'  THEN cb.NodeValue END)
      , VendorName = MAX(CASE WHEN cb.NodeName = 'VendorName' THEN cb.NodeValue END)
    FROM cteBase AS cb
    GROUP BY cb.Position
    ORDER BY cb.Position;

    -- * The "for $i in ." clause defines a variable named $i that contains the current node (.). This is basically a hack to work around XQuery's lack of an XSLT-like current() function.
    -- * The ../* expression selects all siblings (children of the parent) of the current node.
    -- * The [. << $i] predicate filters the list of siblings to those that precede (<<) the current node ($i). This is exponentially slow, as it re-counts all teh sibligs at for EVERY row.
    -- * We count() the number of preceding siblings and then add 1 to get the position. That way the first node (which has no preceding siblings) is assigned a position of 1.

  • andycadley - Wednesday, August 29, 2018 12:01 PM

    If you actually want the name/number combos to retain some sort of sequencing you have to resort to some fairly horrible code to work around the limitations of XPath in SQL Server:


    Select
     Position, NodeName, NodeValue
    From
    (
     Select
      V.value('local-name(.)', 'varchar(100)') As NodeName,
      V.value('.', 'varchar(100)') As NodeValue,
      V.value('for $i in . return count(../*[. << $i]) + 1', 'int') As Position
     From @xml.nodes('/SearchQuery/*') N(V)
    ) X
    Order By Position

    Actually, there is an easier method.

    Select
      V.value('local-name(.)', 'varchar(100)') As NodeName,
      V.value('.', 'varchar(100)') As NodeValue,
    ROW_NUMBER() OVER(ORDER BY V) Position
     From @xml.nodes('/SearchQuery/) N(V)]

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • And if one wants to take it further, using Drew's fine code as the basis, what about this?
    DECLARE @xml AS xml =
        CONVERT(xml,
    N'<SearchQuery>
      <DealTypeDesc>Deal</DealTypeDesc>
      <VendorNum>1</VendorNum>
      <VendorName>Vendor1</VendorName>
      <VendorNum>2</VendorNum>
      <VendorName>Vendor2</VendorName>
      <VendorNum>3</VendorNum>
      <VendorName>Vendor3</VendorName>
      <VendorNum>4</VendorNum>
      <VendorName>Vendor4</VendorName>
      <VendorNum>5</VendorNum>
      <VendorName>Vendor5</VendorName>
    </SearchQuery>');

    WITH RAW_DATA AS (

        SELECT
            V.value('local-name(.)', 'varchar(100)') AS NodeName,
            V.value('.', 'varchar(100)') AS NodeValue,
            ROW_NUMBER() OVER(ORDER BY V) AS Position
        FROM @xml.nodes('/SearchQuery/*') AS N (V)
    ),
        GROUPED_DATA AS (

            SELECT
                RD.NodeName,
                RD.NodeValue,
                RD.Position,
                (ROW_NUMBER() OVER(ORDER BY RD.Position) + 1) / 2 AS Grp
            FROM RAW_DATA AS RD
            WHERE RD.Position > 1
    )
    SELECT
        (SELECT NodeValue FROM RAW_DATA WHERE Position = 1) AS Deal,
        MAX(CASE WHEN G.Position % 2 = 0 THEN G.NodeValue END) AS VendorNum,
        MAX(CASE WHEN G.Position % 2 = 1 THEN G.NodeValue END) AS VendorName
    FROM GROUPED_DATA AS G
    GROUP BY
        G.Grp
    ORDER BY G.Grp;

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Row Number in this case is not guaranteed to return the order nodes were in the XML, you have to use XQuery to get that.

  • andycadley - Thursday, August 30, 2018 9:44 AM

    Row Number in this case is not guaranteed to return the order nodes were in the XML, you have to use XQuery to get that.

    Okay, so is Drew's query going to have that problem with his ROW_NUMBER function?   If it does, then so does mine, because I rely on his...

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • sgmunson - Thursday, August 30, 2018 10:05 AM

    andycadley - Thursday, August 30, 2018 9:44 AM

    Row Number in this case is not guaranteed to return the order nodes were in the XML, you have to use XQuery to get that.

    Okay, so is Drew's query going to have that problem with his ROW_NUMBER function?   If it does, then so does mine, because I rely on his...

    Unfortunately yes. You really do have to do it with some fairly nasty XQuery/XPath stuff if you want the correct ordering to be guaranteed. It wouldn't be quite so bad if SQL Server supported the full range of functionality but it's particularly limited when it comes to ordering because a lot of the functionality that underpins the support of XML ignores the concept of a fixed order (hardly surprising because SQL generally does)

  • Just to close the loop.... I wasn't worried about order.  I just needed to get it in key/value pairs, so I could use it as a lookup.  Thanks for all your help

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Mike01 - Thursday, August 30, 2018 12:13 PM

    Just to close the loop.... I wasn't worried about order.  I just needed to get it in key/value pairs, so I could use it as a lookup.  Thanks for all your help

    To get the pairs you need to make sure they are built from sequential entries in the XML file.
    To make sure you're taking sequential entries you need to assure the order of values.
    Without the ordrer of entries assured you may wery well bind VendorNum "1" to VendorName "Vendor 3" and VendorNum "4" to vendorName "Vendor 1"

    _____________
    Code for TallyGenerator

  • not in my case.  These are being passed into a proc and the proc is returning Vendors that match the VendorNum or VendorName, order doesn't matter

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply