SQL Query to extract XML data

  • Hi,

    I am using sqlserver 2005, one of my database tables "searches" has a column "xmltext" which is of "ntext" datatype. This column has xml data stored, I need to extract some fields from this xml string for report purpose. Below you will see the xml string, I need to extract

    email: "abc@abc.com" ,

    from: "2007-11-01T00:00:00-04:00"

    to: "2008-06-30T00:00:00-04:00"

    content: "quot;FL Salt" "F.L. Salt" "Project X" baltimore"

    Can some one help me with the sql statement to extract these fields. So far I have this query but it doesnt give any data. I was trying this just for the email data.

    select col.d2p1.value('.','nvarchar(max)')

    from (select cast(xmltext as xml) as col

    from tblintsearches

    where caseid = 72

    and searchid = 464) as sub

    cross apply col.nodes('/SearchCriteriaDS/SearchCriterias/SearchRecipients/SearchRecipient/@d2p1') col(d2p1)

    XML String

    ======================================================

    =========================================================================

    TIA,

    Pallav.

  • Hi, some how the XML does not show up in your post. Could you replace the XML tags with '[]' and see if that works?

    .

  • Sorry about that I didnt realize that it didnt post the xml string. I have replaced all " " tags with "[","]".

    [SearchCriteriaDS xmlns="http://www.teuri.org/SearchCriteriaDS.xsd"]

    [SearchCriterias d2p1:Name="abc@abc.gov-emails" d2p1:SearchTemplateID="-1" d2p1:OnlyNew="true" d2p1:SourceTargetDirection="3" d2p1:SampleSize="100" d2p1:MinimumNumberRecipient="0" d2p1:AutoAccept="false" d2p1:SampleCap="-1" d2p1:FreeformTargetsOnly="false" xmlns:d2p1="http://www.tempuri.org/SearchCriteriaDS.xsd"]

    [PolicyTypes]1[/PolicyTypes]

    [PolicyFilterByCase]false[/PolicyFilterByCase]

    [IncludeNonIndexed]false[/IncludeNonIndexed]

    [SearchRecipients d2p1:Type="4" d2p1:Operant="1"]

    [SearchRecipient d2p1:Name="abc@abc.com" d2p1:Type="2" d2p1:ID="-1" /]

    [/SearchRecipients]

    [SearchDates d2p1:Operant="1"]

    [SearchDate d2p1:From="2007-11-01T00:00:00-04:00" d2p1:To="2008-06-30T00:00:00-04:00" d2p1:Operant="1" d2p1:Type="2" d2p1:RelativeDisplay="0" d2p1:RelativeValue="0" d2p1:DateIndexAttribute="0" /]

    [/SearchDates]

    [SearchContents d2p1:Operant="0"]

    [IncludeSubject]true[/IncludeSubject]

    [IncludeAll]false[/IncludeAll]

    [SearchContent d2p1:Text=""FL Salt" "F.L. Salt" "Project X" baltimore" d2p1:Operant="0" d2p1:ID="-1" d2p1:Type="1" /]

    [/SearchContents]

    [RetentionCategories d2p1:Operant="0" /]

    [Policies d2p1:Operant="0" /]

    [/SearchCriterias]

    [/SearchCriteriaDS]

  • You need to fix a few issues with the XML documen before you can process them.

    1. You are using a namespace prefix in the attribute names. But the namespace prefix is not declared in the XML document.

    2. An attribute value cannot accpet a double quote as part of the value. you need to encode it. For example, this is wrong: [d2p1:Text=""FL Salt" "F.L. Salt""].

    Do you have control over this data, so that you can correct them? Or is it coming from some other application?

    .

  • Hi Jacob,

    Thanks! for the reply. I dont have control over the XML document not even sure how it gets loaded into the column, I was just assigned to generate report based on this XML.

    Are there any workarounds?

    Thanks!

    Pallav

  • I see that there is a namespace declared already. I did not see it earlier because it was on the far right corner.

    I replaced the double quotes and the following works.

    DECLARE @x XML

    SELECT @x = '

    [SearchCriteriaDS xmlns="http://www.teuri.org/SearchCriteriaDS.xsd"]

    [SearchCriterias d2p1:Name="abc@abc.gov-emails" d2p1:SearchTemplateID="-1" d2p1:OnlyNew="true" d2p1:SourceTargetDirection="3" d2p1:SampleSize="100" d2p1:MinimumNumberRecipient="0" d2p1:AutoAccept="false" d2p1:SampleCap="-1" d2p1:FreeformTargetsOnly="false" xmlns:d2p1="http://www.tempuri.org/SearchCriteriaDS.xsd"]

    [PolicyTypes]1[/PolicyTypes]

    [PolicyFilterByCase]false[/PolicyFilterByCase]

    [IncludeNonIndexed]false[/IncludeNonIndexed]

    [SearchRecipients d2p1:Type="4" d2p1:Operant="1"]

    [SearchRecipient d2p1:Name="abc@abc.com" d2p1:Type="2" d2p1:ID="-1" /]

    [/SearchRecipients]

    [SearchDates d2p1:Operant="1"]

    [SearchDate d2p1:From="2007-11-01T00:00:00-04:00" d2p1:To="2008-06-30T00:00:00-04:00" d2p1:Operant="1" d2p1:Type="2" d2p1:RelativeDisplay="0" d2p1:RelativeValue="0" d2p1ateIndexAttribute="0" /]

    [/SearchDates]

    [SearchContents d2p1:Operant="0"]

    [IncludeSubject]true[/IncludeSubject]

    [IncludeAll]false[/IncludeAll]

    [SearchContent d2p1:Text="FL Salt F.L. Salt Project X baltimore" d2p1:Operant="0" d2p1:ID="-1" d2p1:Type="1" /]

    [/SearchContents]

    [RetentionCategories d2p1:Operant="0" /]

    [Policies d2p1:Operant="0" /]

    [/SearchCriterias]

    [/SearchCriteriaDS]'

    ;WITH XMLNAMESPACES (

    DEFAULT 'http://www.teuri.org/SearchCriteriaDS.xsd',

    'http://www.tempuri.org/SearchCriteriaDS.xsd' AS d2p1

    )

    SELECT

    x2.value('@d2p1:Name[1]','VARCHAR(25)') AS email,

    x3.value('@d2p1:From[1]','VARCHAR(25)') AS FromDate

    FROM @x.nodes('/SearchCriteriaDS/SearchCriterias') AS s1(x1)

    CROSS APPLY s1.x1.nodes('SearchRecipients/SearchRecipient') AS s2(x2)

    CROSS APPLY s1.x1.nodes('SearchDates/SearchDate') AS s3(x3)

    /*

    OUTOUT:

    email FromDate

    ------------------------- -------------------------

    abc@abc.com 2007-11-01T00:00:00-04:00

    */

    You need to replace the double quotes some how.

    .

  • Hi Jacob,

    How do I modify this sql statement to get the result?

    select col.d2p1.value('.','nvarchar(max)')

    from (select cast(xmltext as xml) as col

    from tblintsearches

    where caseid = 72

    and searchid = 464) as sub

    cross apply col.nodes('/SearchCriteriaDS/SearchCriterias/SearchRecipients/SearchRecipient/@d2p1') col(d2p1)

    Thanks! again for your help.

  • I created an example that closely matches your requirement. See this example:

    DECLARE @t TABLE (data NVARCHAR(MAX))

    INSERT INTO @t (data) SELECT '

    [SearchCriteriaDS xmlns="http://www.teuri.org/SearchCriteriaDS.xsd"]

    [SearchCriterias d2p1:Name="abc@abc.gov-emails" d2p1:SearchTemplateID="-1" d2p1:OnlyNew="true" d2p1:SourceTargetDirection="3" d2p1:SampleSize="100" d2p1:MinimumNumberRecipient="0" d2p1:AutoAccept="false" d2p1:SampleCap="-1" d2p1:FreeformTargetsOnly="false" xmlns:d2p1="http://www.tempuri.org/SearchCriteriaDS.xsd"]

    [PolicyTypes]1[/PolicyTypes]

    [PolicyFilterByCase]false[/PolicyFilterByCase]

    [IncludeNonIndexed]false[/IncludeNonIndexed]

    [SearchRecipients d2p1:Type="4" d2p1:Operant="1"]

    [SearchRecipient d2p1:Name="abc@abc.com" d2p1:Type="2" d2p1:ID="-1" /]

    [/SearchRecipients]

    [SearchDates d2p1:Operant="1"]

    [SearchDate d2p1:From="2007-11-01T00:00:00-04:00" d2p1:To="2008-06-30T00:00:00-04:00" d2p1:Operant="1" d2p1:Type="2" d2p1:RelativeDisplay="0" d2p1:RelativeValue="0" d2p1ateIndexAttribute="0" /]

    [/SearchDates]

    [SearchContents d2p1:Operant="0"]

    [IncludeSubject]true[/IncludeSubject]

    [IncludeAll]false[/IncludeAll]

    [SearchContent d2p1:Text="FL Salt F.L. Salt Project X baltimore" d2p1:Operant="0" d2p1:ID="-1" d2p1:Type="1" /]

    [/SearchContents]

    [RetentionCategories d2p1:Operant="0" /]

    [Policies d2p1:Operant="0" /]

    [/SearchCriterias]

    [/SearchCriteriaDS]'

    ;WITH XMLNAMESPACES (

    DEFAULT 'http://www.teuri.org/SearchCriteriaDS.xsd',

    'http://www.tempuri.org/SearchCriteriaDS.xsd' AS d2p1

    )

    SELECT

    x2.value('@d2p1:Name[1]','VARCHAR(25)') AS email,

    x3.value('@d2p1:From[1]','VARCHAR(25)') AS FromDate

    FROM (

    SELECT

    CAST(data AS XML) AS col

    FROM @t

    ) as sub

    CROSS apply col.nodes('/SearchCriteriaDS/SearchCriterias') AS s1(x1)

    CROSS APPLY s1.x1.nodes('SearchRecipients/SearchRecipient') AS s2(x2)

    CROSS APPLY s1.x1.nodes('SearchDates/SearchDate') AS s3(x3)

    /*

    OUTOUT:

    email FromDate

    ------------------------- -------------------------

    abc@abc.com 2007-11-01T00:00:00-04:00

    */

    .

  • I have changed my sql this way

    select x2.value('@d2p1:Name[1]','VARCHAR(25)') AS email,

    x3.value('@d2p1:From[1]','VARCHAR(25)') AS FromDate

    from (select cast(xmltext as xml) as col

    from tblintsearches

    where caseid = 72

    and searchid = 464) as sub

    CROSS apply col.nodes('/SearchCriteriaDS/SearchCriterias') AS s1(x1)

    CROSS APPLY s1.x1.nodes('SearchRecipients/SearchRecipient') AS s2(x2)

    CROSS APPLY s1.x1.nodes('SearchDates/SearchDate') AS s3(x3)

    But I still get this error

    Msg 2229, Level 16, State 1, Line 1

    XQuery [sub.col.value()]: The name "d2p1" does not denote a namespace.

    Do I need to add this to XML data?

    ;WITH XMLNAMESPACES (

    DEFAULT 'http://www.teuri.org/SearchCriteriaDS.xsd',

    'http://www.tempuri.org/SearchCriteriaDS.xsd' AS d2p1

  • Yes, you need to add the namespace declaration. Also, make sure that you use the same expression in the value() method as given in my example.

    .

  • Hi Jacob,

    The table "tblintsearches" has 70k records, this report should run against all these records and I dont have any write access to this table.

    Is it possible to get this from straight sql? I know that I am asking a lot here but just want to make sure if it is possible this way or not?

    Thanks!

    Pallav

  • Pallav,

    By namespace declaration, i meant the statement 'WITH XMLNAMESPACES' at the beginning of your query. Run the previous example and see if it works well on your table.

    .

  • Hi Jason,

    Thanks! a lot it worked I used this way and it worked

    ;WITH XMLNAMESPACES (

    DEFAULT 'http://www.tempuri.org/SearchCriteriaDS.xsd',

    'http://www.tempuri.org/SearchCriteriaDS.xsd' AS d2p1

    )

    select x3.value('@d2p1:From[1]','VARCHAR(25)') AS StartDate,

    x3.value('@d2p1:To[1]','VARCHAR(25)') AS EndDate,

    x2.value('@d2p1:Name[1]','VARCHAR(25)') AS email,

    x4.value('@d2p1:Text[1]','VARCHAR(250)') AS content, numhits, searchid, caseid

    from (select caseid, searchid, numhits, cast(xmltext as xml) as col

    from tblintsearches

    where caseid = 73) as sub

    CROSS apply col.nodes('/SearchCriteriaDS/SearchCriterias') AS s1(x1)

    CROSS APPLY s1.x1.nodes('SearchRecipients/SearchRecipient') AS s2(x2)

    CROSS APPLY s1.x1.nodes('SearchDates/SearchDate') AS s3(x3)

    CROSS APPLY s1.x1.nodes('SearchContents/SearchContent') AS s4(x4)

    Now I have an another question I need to join the result of this query with another query based on the case id.

    Is it possible? with the namespace in front of the query this is problems when i tried it.

    select tc.name, tc.caseid, tr.rolename, tp.principalname, tpm.name as permissions

    from tblintsecurity ts, tblrole tr, tblprincipal tp, tblpermission tpm,

    tblintrolepermission trp, tblcase tc

    where ts.roleid = tr.roleid

    and ts.principalid = tp.principalid

    and tr.roleid = trp.roleid

    and trp.permissionid = tpm.permissionid

    and ts.caseid = tc.caseid

    and tpm.name = 'Review'

    and tr.rolename <> 'Admin'

    and tc.caseid = 73

  • Never mind, I got it. Thanks! a lot for your help.

  • You are welcome!

    .

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply