Extracting and parsing XML from fields (CRM related)

  • I'm a relative novice when it comes to SQL, so please be gentle with me. 😛

    I've been getting on reasonably well producing reports from our Microsoft CRM-based application, but have come across something I've no experience of before. CRM allows ResourceGroups to be created to which you can assign resources. In our case, we're using these ResourceGroups to create team groupings, which we can then use in reports without having to hard code the groupings.

    The problem I have is the way in which CRM stores these groupings. I was expecting that the relationships would be defined in a table, (one group to many resources), but it appears that they are actually stored as XML within a field so the SQL would be:

    SELECT ResourceGroupBase.*, ConstraintBasedGroupBase.*

    FROM ResourceGroupBase INNER JOIN

    ConstraintBasedGroupBase ON ResourceGroupBase.ResourceGroupId = ConstraintBasedGroupBase.ConstraintBasedGroupId

    and the XML that describes members of the group is contained in the 'Constraint' fiels:

    <Constraints><Constraint><Expression><Body>resource["Id"] == {005b7cce-de01-e011-a55b-0050569d00bf} || resource["Id"] == {3dcbc1d6-a3e1-df11-a501-00188b8b43b5} || resource["Id"] == {3fcbc1d6-a3e1-df11-a501-00188b8b43b5} || resource["Id"] == {49cbc1d6-a3e1-df11-a501-00188b8b43b5} || resource["Id"] == {4ccbc1d6-a3e1-df11-a501-00188b8b43b5} || resource["Id"] == {4dcbc1d6-a3e1-df11-a501-00188b8b43b5} || resource["Id"] == {4ecbc1d6-a3e1-df11-a501-00188b8b43b5} || resource["Id"] == {4fcbc1d6-a3e1-df11-a501-00188b8b43b5} || resource["Id"] == {53cbc1d6-a3e1-df11-a501-00188b8b43b5} || resource["Id"] == {55cbc1d6-a3e1-df11-a501-00188b8b43b5} || resource["Id"] == {5acbc1d6-a3e1-df11-a501-00188b8b43b5} || resource["Id"] == {71cbc1d6-a3e1-df11-a501-00188b8b43b5}</Body><Parameters><Parameter name="resource" /></Parameters></Expression></Constraint></Constraints>

    This XML contains the GUIDS / IDs I require to link to the TeamBase table and get my membership, but I've never come across a situation such as this and don't know how to proceed with my query.

    Is it possible (using SQL) to make a query that returns the resource group and associated teams such as:

    ResourceGroup Team

    Group1 Team A

    Group1 Team B

    Group1 Team C

    Group2 Team Y

    Group3 Team Z

    If so, how would I go about it? I'm guessing it would involve parsing the XML, but I have no experience of this or how / if it is possible?

    Stuart

  • Double post

  • I take it that the lack of response mean 'we don't know' 🙁

    Before I give up, can someone tell me if it's possible to move this post into the T-SQL forum without having to cross-post (as a little more research has suggested to me that the solution might be T-SQL)?

    Thanks

    Stuart

  • Would it be possible to have a little more info.

    You have two tables;

    ResourceGroupBase = ResourceGroupId (the GUID and unique reference to this table) and some details about the ResourceGroupBase

    ConstraintBasedGroupBase = ConstraintBasedGroupId (according to your sql this is the same as the ResourceGroupBase but I would have thought it would be the unique reference for this table)

    I understand you want the Group each team is in but I don't understand where this information is in the tables.

    I assume the xml given is an example out of one column in one of the tables (the ConstraintBasedGroupBase?).

    Could you give us the schema's for the tables and a couple of examples from the relevant fields.

    Thanks,

    Peter

  • The lack of responses might also be an indicator that we don't have enough information yet.

    You're looking for a "resource group and associated teams" but neither is part of the xml structure you posted.

    Also, the xml is not really in a ready to use format. Xml data is better presented using the xml code tag.

    Your xml data would look like this:

    <Constraints>

    <Constraint>

    <Expression>

    <Body>

    resource["Id"] == {005b7cce-de01-e011-a55b-0050569d00bf} ||

    resource["Id"] == {3dcbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {3fcbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {49cbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {4ccbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {4dcbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {4ecbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {4fcbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {53cbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {55cbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {5acbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {71cbc1d6-a3e1-df11-a501-00188b8b43b5}

    </Body>

    <Parameters><Parameter name="resource" />

    </Parameters>

    </Expression>

    </Constraint>

    </Constraints>

    Based on those data, what would be your expected result?



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • I don't think that there is much more information to give, so I'll ramble for a bit and hope that helps.

    Our company purchased a configuration of Microsoft CRM to use with one of our key business processes. This is a heavily customised implementation of MS CRM, but the bit that I'm dealing with for this question (Resource Groups) is a native MS CRM feature - it has not been changed from the vanilla CRM functionality.

    Resource Groups (as I understand it) are a grouping of one or more different entities within the application instance - a Resource Group can contain any combination of these different entities (e.g. team, user, equipment, etc.). For example, you may have a resource group consisting of three teams, one user, and two items of equipment.

    For reasons I will not pretend to understand, the Resource Group's main information is stored in the ResourceGroupBase table and the members of that group are stored in ConstraintBasedGroupBase. These two tables have a one-to-one relationship and are joined on their respective IDs (ResourceGroupId and ConstraintBasedGroupId). I honestly have no understanding why they are two different tables, as ConstraintBasedGroupBase simply appears to be an extension of the ResourceGroupBase table (as far as I'm aware with my limited knowledge of the application).

    ResourceGroupBase contains a field called 'Name', which gives the name of the Resource Group. ConstraintBasedGroupBase contains a filed called 'Constraints', which contains the members of the corresponding Resource Group, stored as xml. As you have already seen, it is not what you might have thought of as properly constructed xml, as the different member IDs are contained within the same element. You will also notice the absence of anything to identify what the IDs represent - are they a team, user, or item of equipment? As far as I understand, the only way you're going to find out which is by plugging the id into the different tables and see which one gives you a result.

    I know that this all seems very odd, but to get a better understanding of why it-is-how-it-is, I'm afraid you'll have to ask Microsoft, 'cause neither I nor my vendor know, and neither of us have any experience of working with this particular functionality.

    I don't have access to the database at the moment, but I'm pretty confident that I can give all of the fields of relevance in each of the tables:

    ResourceGroupBase table

    GUID / primary key = 'ResourceGroupId'

    Resource Group name = 'Name'

    ConstraintBasedGroupBase table

    GUID / primary key = 'ConstraintBasedGroupId'

    Field containing group members = 'Constraints'

    TeamBase

    GUID / primary key = 'TeamId'

    Team name = 'Name'

    Notes

    * ResourceGroupBase and ConstraintBasedGroupBase have IDs with different names, but they contain the same values

    * Records from ResourceGroupBase and ConstraintBasedGroupBase have a one-to-one relationship

    * For my problem, Resource Groups contain only teams (no users or items of equipment)

    I could condense the problem down to using a single table - ConstraintBasedGroupBase - to remove any confusion about the other tables involved. If we approach the problem from that perspective, then we only need to look at two fields: 'ConstraintBasedGroupId' and 'Constraints'. If I show the type values of these fields for one row, we might get (with some artificially added formatting to make it look better):

    ConstraintBasedGroupId = '065bbcce-eb01-e003-a59b-5550569a00af'

    Constraints = '<Constraints>

    <Constraint>

    <Expression>

    <Body>

    resource["Id"] == {005b7cce-de01-e011-a55b-0050569d00bf} ||

    resource["Id"] == {3dcbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {3fcbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {49cbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {4ccbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {4dcbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {4ecbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {4fcbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {53cbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {55cbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {5acbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {71cbc1d6-a3e1-df11-a501-00188b8b43b5}

    </Body>

    <Parameters><Parameter name="resource" />

    </Parameters>

    </Expression>

    </Constraint>

    </Constraints>'

    My aim is to convert this single row into a table containing 12 rows as follows:

    ConstraintBasedGroupId TeamId

    '065bbcce-eb01-e003-a59b-5550569a00af' '005b7cce-de01-e011-a55b-0050569d00bf'

    '065bbcce-eb01-e003-a59b-5550569a00af' '3dcbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '3fcbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '49cbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '4ccbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '4dcbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '4ecbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '4fcbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '53cbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '55cbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '5acbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '71cbc1d6-a3e1-df11-a501-00188b8b43b5'

    The next row in the ConstraintBasedGroupBase table would then append more results to this table.

    Once I have this table, it is trivial for me to get the Resource Group's name from ResourceGroupBase and the member team's name from TableBase, as I have the corresponding IDs for each table.

    So, my problems might be expanded into the following steps:

    1) Extract the XML from the table field (ConstraintBasedGroupBase.Constraints)

    2) Extract the string value for the <body> element from the XML

    3) Parse the string to extract the different team IDs

    4) Insert those IDs into a temporary table, along with the corresponding value for ConstraintBasedGroupId

    5) Move onto the next row and repeat until I've extracted IDs for all members of all ResourceGroups

    6) Use my temp table to get the names of the Resource Groups and corresponding teams.

    I can probably bumble my way through most of these steps, but at the moment I can't even achieve the first two to get me started.

    Everything clear as mud? It certainly is for me! :pinch:

    Stuart

  • Please either post the expected result based on the first sample or the source data of your expected second result set.

    Remember: we don't know the business case you're dealing with nor can we look over your shoulder to see what you see. All we have is what you post. And the more consistent that is, the easier it is for us to "get" wha tyou're looking for.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Isn't that what I've done in the second post?

    I've presented a table consisting of two fields:

    ConstraintBasedGroupId

    Constraints

    I've presented sample values for these two fields (i.e. one row of the source table):

    ConstraintBasedGroupId = '065bbcce-eb01-e003-a59b-5550569a00af'

    Constraints = '<Constraints>

    <Constraint>

    <Expression>

    <Body>

    resource["Id"] == {005b7cce-de01-e011-a55b-0050569d00bf} ||

    resource["Id"] == {3dcbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {3fcbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {49cbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {4ccbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {4dcbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {4ecbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {4fcbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {53cbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {55cbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {5acbc1d6-a3e1-df11-a501-00188b8b43b5} ||

    resource["Id"] == {71cbc1d6-a3e1-df11-a501-00188b8b43b5}

    </Body>

    <Parameters><Parameter name="resource" />

    </Parameters>

    </Expression>

    </Constraint>

    </Constraints>'

    And I've presented the output I'm trying to achieve for that single source row (a table containing multiple rows)

    ConstraintBasedGroupId TeamId

    '065bbcce-eb01-e003-a59b-5550569a00af' '005b7cce-de01-e011-a55b-0050569d00bf'

    '065bbcce-eb01-e003-a59b-5550569a00af' '3dcbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '3fcbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '49cbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '4ccbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '4dcbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '4ecbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '4fcbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '53cbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '55cbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '5acbc1d6-a3e1-df11-a501-00188b8b43b5'

    '065bbcce-eb01-e003-a59b-5550569a00af' '71cbc1d6-a3e1-df11-a501-00188b8b43b5'

    Or am I missing something fundamental? :unsure:

    Stuart

  • There wasn't a missing part but rathe some confusion (there wasn't a table presented but two variables assigned with values and the first post did just include some plain values....)

    Anyway, here's something that should get you started:

    ;WITH cte AS

    (

    SELECT

    ConstraintBasedGroupId,

    T.c.value ('Body[1]','varchar(8000)') AS col

    FROM #temp t1

    CROSS APPLY t1.CONSTRAINTS.nodes('Constraints/Constraint/Expression') T(c)

    )

    SELECT

    cte.ConstraintBasedGroupId,

    LTRIM(split.item) AS item

    FROM cte

    CROSS APPLY

    dbo.DelimitedSplit8K(REPLACE(col,'||','|'),'|') split

    You'll need the DelimitedSplit8K function to split the data. If you don't have it in your tool box already, search this site for it (originally posted by Jeff Moden).



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Thanks. I'll give that a whirl when I'm in the office on Monday.

    Stuart

  • LutzM (1/7/2012)


    You'll need the DelimitedSplit8K function to split the data. If you don't have it in your tool box already, search this site for it (originally posted by Jeff Moden).

    Lutz,

    Thank you for the reference but you might want to update the "Delimited Split Function" link in your signature to the newer, more performant version at the following link...

    http://www.sqlservercentral.com/articles/Tally+Table/72993/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (1/8/2012)


    LutzM (1/7/2012)


    You'll need the DelimitedSplit8K function to split the data. If you don't have it in your tool box already, search this site for it (originally posted by Jeff Moden).

    Lutz,

    Thank you for the reference but you might want to update the "Delimited Split Function" link in your signature to the newer, more performant version at the following link...

    http://www.sqlservercentral.com/articles/Tally+Table/72993/

    Done as requested 😀

    Thank you for the updated link and the reminder to change my signature!



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • My pleasure. It's an honor that you're carrying some of my articles as a link in your signature. Thanks, Lutz.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I've taken the suggested code and played around with it to get a better understaning of how it works. The fruits of my playing are below:

    WITH TempTable AS (

    SELECT ConstraintBasedGroupId, Constraints

    FROM ConstraintBasedGroupBase

    )

    SELECT

    TempTable.ConstraintBasedGroupId,

    SUBSTRING(item,CHARINDEX('{',item)+1,CHARINDEX('}',item)-CHARINDEX('{',item)-1) AS TeamId

    FROM TempTable

    CROSS APPLY

    DelimitedSplit8K(REPLACE(Constraints,'||','|'),'|') split

    ) AS ResourceGroupToTeam

    This is fantastic, in that it does what I want it to (it newly introduces me to CROSS APPLY and removes the complexity of messing around with the XML). However, I've fallen foul of a different (and I suspect very simple) problem at the next hurdle.

    The above code returns the IDs in two fields. My first though was to use the above as a derrive table with the other two tables involed to get the text, something like this:

    SELECT ResourceGroupBase.Name AS ResourceGroupName,

    TeamBase.Name AS TeamName

    FROM ResourceGroupBase

    INNER JOIN

    (

    WITH TempTable AS (

    SELECT ConstraintBasedGroupId, Constraints

    FROM ConstraintBasedGroupBase

    )

    SELECT

    TempTable.ConstraintBasedGroupId,

    SUBSTRING(item,CHARINDEX('{',item)+1,CHARINDEX('}',item)-CHARINDEX('{',item)-1) AS TeamId

    FROM TempTable

    CROSS APPLY

    DelimitedSplit8K(REPLACE(Constraints,'||','|'),'|') split

    ) AS ResourceGroupToTeam

    ON ResourceGroupBase.ResourceGroupId = ResourceGroupToTeam.ConstraintBasedGroupId

    INNER JOIN

    TeamBase

    ON ResourceGroupToTeam.TeamId = TeamBase.TeamId

    I've never worked with 'WITH' before, so I'm sure it is has restrictions on its use that I'm falling foul of. Given that this code block will form the basis of other queries (and so it would be nice if it could be discretely inserted as a unit), how should I write the code above to make the final query (ResourceGroupName and TeamName) work?

    Thanks for the responses so far.

    Stuart

  • Common table expressions (the 'WITH' bit) go at the start of the query.

    WITH TempTable AS (

    SELECT ConstraintBasedGroupId, Constraints

    FROM ConstraintBasedGroupBase

    ),

    ResourceGroupToTeam AS (

    SELECT

    TempTable.ConstraintBasedGroupId,

    SUBSTRING(item,CHARINDEX('{',item)+1,CHARINDEX('}',item)-CHARINDEX('{',item)-1) AS TeamId

    FROM TempTable

    CROSS APPLY

    DelimitedSplit8K(REPLACE(Constraints,'||','|'),'|') split

    )

    SELECT ResourceGroupBase.Name AS ResourceGroupName,

    TeamBase.Name AS TeamName

    FROM ResourceGroupBase

    INNER JOIN ResourceGroupToTeam

    ON ResourceGroupBase.ResourceGroupId = ResourceGroupToTeam.ConstraintBasedGroupId

    INNER JOIN TeamBase

    ON ResourceGroupToTeam.TeamId = TeamBase.TeamId

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply