Parsing XML Blob

  • I am having problem trying to parse an XML field in a table because it is too complex for me. I need to get the data from the <HighRiskIndicators>, <RiskCode>, and <Description> fields to be returned individually rather than as a long string. I can write a query to get all the data from the Results to return as one long string, but I need these fields to return individually. I am not very experienced with querying XML data so any help would be greatly appreciated. I have attached the XML I am trying to query against. Thanks for the help.

  • elfresco1 (3/26/2014)


    I am having problem trying to parse an XML field in a table because it is too complex for me. I need to get the data from the <HighRiskIndicators>, <RiskCode>, and <Description> fields to be returned individually rather than as a long string. I can write a query to get all the data from the Results to return as one long string, but I need these fields to return individually. I am not very experienced with querying XML data so any help would be greatly appreciated. I have attached the XML I am trying to query against. Thanks for the help.

    This should get you started;

    DECLARE @TXTXML XML = N'<InstantIDModelResponse

    xmlns:xsd="http://www.w3.org/2001/XMLSchema"

    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    <Header xmlns="http://webservices.seisint.com/WsIdentity">

    <Status>0</Status>

    <QueryId>12345</QueryId>

    <TransactionId>987654321</TransactionId>

    </Header>

    <Result xmlns="http://webservices.seisint.com/WsIdentity">

    <InputEcho>

    <Name>

    <First>John</First>

    <Last>Doe</Last>

    </Name>

    <Address>

    <StreetAddress1>123 Main St</StreetAddress1>

    <City>Hometown</City>

    <State>NY</State>

    <Zip5>12354</Zip5>

    </Address>

    <DOB>

    <Year>1900</Year>

    <Month>01</Month>

    <Day>02</Day>

    </DOB>

    <Age>114</Age>

    <SSN>111111111</SSN>

    <HomePhone>2222222222</HomePhone>

    </InputEcho>

    <Models>

    <Model>

    <Name>ConsumerVerificationIndex</Name>

    <Scores>

    <Score>

    <Type>nas</Type>

    <Value>8</Value>

    <HighRiskIndicators>

    <HighRiskIndicator>

    <RiskCode>03</RiskCode>

    <Description>The input SSN was issued prior to the input date-of-birth</Description>

    <Sequence>1</Sequence>

    </HighRiskIndicator>

    <HighRiskIndicator>

    <RiskCode>26</RiskCode>

    <Description>Unable to verify SSN/TIN</Description>

    <Sequence>2</Sequence>

    </HighRiskIndicator>

    <HighRiskIndicator>

    <RiskCode>27</RiskCode>

    <Description>Unable to verify phone number</Description>

    <Sequence>3</Sequence>

    </HighRiskIndicator>

    <HighRiskIndicator>

    <RiskCode>52</RiskCode>

    <Description>The input first name is not associated with input SSN</Description>

    <Sequence>4</Sequence>

    </HighRiskIndicator>

    <HighRiskIndicator>

    <RiskCode>10</RiskCode>

    <Description>The input phone number is a mobile number</Description>

    <Sequence>5</Sequence>

    </HighRiskIndicator>

    <HighRiskIndicator>

    <RiskCode>82</RiskCode>

    <Description>The input name and address return a different phone number</Description>

    <Sequence>6</Sequence>

    </HighRiskIndicator>

    </HighRiskIndicators>

    </Score>

    </Scores>

    </Model>

    </Models>

    <DOBVerified>true</DOBVerified>

    </Result>

    </InstantIDModelResponse>'

    ;WITH XMLNAMESPACES ( 'http://www.w3.org/2001/XMLSchema-instance' AS xsi

    ,'http://webservices.seisint.com/WsIdentity' AS ns1)

    SELECT

    HEA.DER.value('./ns1:QueryId[1]' ,'INT' ) AS QueryId

    ,HEA.DER.value('./ns1:TransactionId[1]' ,'INT' ) AS TransactionId

    ,HRI.CT.value('./ns1:Sequence[1]' ,'NVARCHAR(12)' ) AS Sequence

    ,HRI.CT.value('./ns1:RiskCode[1]' ,'NVARCHAR(12)' ) AS RiskCode

    ,HRI.CT.value('./ns1:Description[1]' ,'NVARCHAR(1024)' ) AS Description

    FROM @TXTXML.nodes('InstantIDModelResponse') AS RN(ODE)

    OUTER APPLY RN.ODE.nodes('ns1:Header') AS HEA(DER)

    OUTER APPLY RN.ODE.nodes('

    ns1:Result/ns1:Models/ns1:Model/ns1:Scores/ns1:Score/ns1:HighRiskIndicators/ns1:HighRiskIndicator'

    ) AS HRI(CT)

    The results

    QueryId TransactionId Sequence RiskCode Description

    -------- ------------- --------- --------- -----------------------------------------------------------

    12345 987654321 1 03 The input SSN was issued prior to the input date-of-birth

    12345 987654321 2 26 Unable to verify SSN/TIN

    12345 987654321 3 27 Unable to verify phone number

    12345 987654321 4 52 The input first name is not associated with input SSN

    12345 987654321 5 10 The input phone number is a mobile number

    12345 987654321 6 82 The input name and address return a different phone number

  • Eirikur Eiriksson (3/29/2014)


    elfresco1 (3/26/2014)


    I am having problem trying to parse an XML field in a table because it is too complex for me. I need to get the data from the <HighRiskIndicators>, <RiskCode>, and <Description> fields to be returned individually rather than as a long string. I can write a query to get all the data from the Results to return as one long string, but I need these fields to return individually. I am not very experienced with querying XML data so any help would be greatly appreciated. I have attached the XML I am trying to query against. Thanks for the help.

    This should get you started;

    DECLARE @TXTXML XML = N'<InstantIDModelResponse

    xmlns:xsd="http://www.w3.org/2001/XMLSchema"

    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

    <Header xmlns="http://webservices.seisint.com/WsIdentity">

    <Status>0</Status>

    <QueryId>12345</QueryId>

    <TransactionId>987654321</TransactionId>

    </Header>

    <Result xmlns="http://webservices.seisint.com/WsIdentity">

    <InputEcho>

    <Name>

    <First>John</First>

    <Last>Doe</Last>

    </Name>

    <Address>

    <StreetAddress1>123 Main St</StreetAddress1>

    <City>Hometown</City>

    <State>NY</State>

    <Zip5>12354</Zip5>

    </Address>

    <DOB>

    <Year>1900</Year>

    <Month>01</Month>

    <Day>02</Day>

    </DOB>

    <Age>114</Age>

    <SSN>111111111</SSN>

    <HomePhone>2222222222</HomePhone>

    </InputEcho>

    <Models>

    <Model>

    <Name>ConsumerVerificationIndex</Name>

    <Scores>

    <Score>

    <Type>nas</Type>

    <Value>8</Value>

    <HighRiskIndicators>

    <HighRiskIndicator>

    <RiskCode>03</RiskCode>

    <Description>The input SSN was issued prior to the input date-of-birth</Description>

    <Sequence>1</Sequence>

    </HighRiskIndicator>

    <HighRiskIndicator>

    <RiskCode>26</RiskCode>

    <Description>Unable to verify SSN/TIN</Description>

    <Sequence>2</Sequence>

    </HighRiskIndicator>

    <HighRiskIndicator>

    <RiskCode>27</RiskCode>

    <Description>Unable to verify phone number</Description>

    <Sequence>3</Sequence>

    </HighRiskIndicator>

    <HighRiskIndicator>

    <RiskCode>52</RiskCode>

    <Description>The input first name is not associated with input SSN</Description>

    <Sequence>4</Sequence>

    </HighRiskIndicator>

    <HighRiskIndicator>

    <RiskCode>10</RiskCode>

    <Description>The input phone number is a mobile number</Description>

    <Sequence>5</Sequence>

    </HighRiskIndicator>

    <HighRiskIndicator>

    <RiskCode>82</RiskCode>

    <Description>The input name and address return a different phone number</Description>

    <Sequence>6</Sequence>

    </HighRiskIndicator>

    </HighRiskIndicators>

    </Score>

    </Scores>

    </Model>

    </Models>

    <DOBVerified>true</DOBVerified>

    </Result>

    </InstantIDModelResponse>'

    ;WITH XMLNAMESPACES ( 'http://www.w3.org/2001/XMLSchema-instance' AS xsi

    ,'http://webservices.seisint.com/WsIdentity' AS ns1)

    SELECT

    HEA.DER.value('./ns1:QueryId[1]' ,'INT' ) AS QueryId

    ,HEA.DER.value('./ns1:TransactionId[1]' ,'INT' ) AS TransactionId

    ,HRI.CT.value('./ns1:Sequence[1]' ,'NVARCHAR(12)' ) AS Sequence

    ,HRI.CT.value('./ns1:RiskCode[1]' ,'NVARCHAR(12)' ) AS RiskCode

    ,HRI.CT.value('./ns1:Description[1]' ,'NVARCHAR(1024)' ) AS Description

    FROM @TXTXML.nodes('InstantIDModelResponse') AS RN(ODE)

    OUTER APPLY RN.ODE.nodes('ns1:Header') AS HEA(DER)

    OUTER APPLY RN.ODE.nodes('

    ns1:Result/ns1:Models/ns1:Model/ns1:Scores/ns1:Score/ns1:HighRiskIndicators/ns1:HighRiskIndicator'

    ) AS HRI(CT)

    The results

    QueryId TransactionId Sequence RiskCode Description

    -------- ------------- --------- --------- -----------------------------------------------------------

    12345 987654321 1 03 The input SSN was issued prior to the input date-of-birth

    12345 987654321 2 26 Unable to verify SSN/TIN

    12345 987654321 3 27 Unable to verify phone number

    12345 987654321 4 52 The input first name is not associated with input SSN

    12345 987654321 5 10 The input phone number is a mobile number

    12345 987654321 6 82 The input name and address return a different phone number

    Hi Eirikur,

    I'm terrible at XML. I understand the hierarchical structure of XML (that part is simple) but I don't understand the parsing aspects very well.

    I notice that you did (what I think is) a nifty trick in the HRI(CT) CROSS APPLY that greatly simplified what I thought one might have to do and I have to admit I've never seen such a thing before. I think that most of the articles on the web and in Books Online are terrible so I have to ask you, is there a good book that you learned from on how to parse XML in such a simple and elegant manner that you could recommend?

    I also have to ask, did you have to manually build the parsing code by observation of the hierarchical structure or is there a tool that you used to blast through this so nicely?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/29/2014)


    I notice that you did (what I think is) a nifty trick in the HRI(CT) CROSS APPLY that greatly simplified what I thought one might have to do and I have to admit I've never seen such a thing before. I think that most of the articles on the web and in Books Online are terrible so I have to ask you, is there a good book that you learned from on how to parse XML in such a simple and elegant manner that you could recommend?

    I also have to ask, did you have to manually build the parsing code by observation of the hierarchical structure or is there a tool that you used to blast through this so nicely?

    Hi Jeff,

    you are absolutely right, there aren't many Sql Server oriented resources on XML. In fact, I have only one book on the subject, Pro SQL Server 2008 XML by Michael Coles, Apress. A good overview of the XML technology in SQL Server but like many others, limited in some sense when it comes to more practical applications. Seth Delconte has written some good articles on mssqltips.com and simple-talk.com, one of the best being Ad-Hoc XML File Querying[/url].

    The nodes() function is a key here, it “translates” each element to a “tabular” form. The CROSS APPLY and OUTER APPLY are acting like joins, inner and outer respectfully. As each nodes() function returns a virtual table, TABLE(COLUMN), the trick is to go down the structure to an attribute level, equivalent of a tabular row (avoiding Cartesian product). This makes the syntax easy to apply and understand. It also mitigates the singularity requirements by only having to define the first [1] instance of an element/attribute as there will be no others.

    I don't use any tools for writing these queries, find them rather more getting in the way than helping. A quick trick is to use .query('(.)') to display the content of each node, makes it more like copy/paste work (no magic ;-). If you replace the previous select statement with the one below, it will kind of demonstrate what I mean. The last two columns will return the content of each node instance in an XML format.

    SELECT

    HEA.DER.value('./ns1:QueryId[1]' ,'INT' ) AS QueryId

    ,HEA.DER.value('./ns1:TransactionId[1]' ,'INT' ) AS TransactionId

    ,HRI.CT.value('./ns1:Sequence[1]' ,'NVARCHAR(12)' ) AS Sequence

    ,HRI.CT.value('./ns1:RiskCode[1]' ,'NVARCHAR(12)' ) AS RiskCode

    ,HRI.CT.value('./ns1:Description[1]' ,'NVARCHAR(1024)' ) AS Description

    ,HEA.DER.query('(.)') AS Header

    ,HRI.CT.query('(.)') AS HighRiskIndicator

    FROM @TXTXML.nodes('InstantIDModelResponse') AS RN(ODE)

    OUTER APPLY RN.ODE.nodes('ns1:Header') AS HEA(DER)

    OUTER APPLY RN.ODE.nodes('

    ns1:Result/ns1:Models/ns1:Model/ns1:Scores/ns1:Score/ns1:HighRiskIndicators/ns1:HighRiskIndicator'

    ) AS HRI(CT)

    I will at the first opportunity, compile a list of XML resources to share with the community. And of course, feel free to contact me with questions on the matter, will help if I can 😎

  • You should write another article, Eirikur. One on the very techniques that you've exposed in this thread. I was pretty sure that problem in the original post was going to take quite a few more cascading CROSS APPLYs and you sewed that up really nicely.

    Thanks for the links and the book recommendations. I'll check them out.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/29/2014)


    You should write another article, Eirikur. One on the very techniques that you've exposed in this thread. I was pretty sure that problem in the original post was going to take quite a few more cascading CROSS APPLYs and you sewed that up really nicely.

    Thanks for the links and the book recommendations. I'll check them out.

    Working on it, any suggestions on a data set to work through?

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply