Construct XML data using structure shown in varchar column

  • I need to construct XML from data which 'hints' at the structure of the required data but I'm struggling to work out how to do this and avoiding cursors.

    The extracted data has a column that effectively shows the node that it the value is to be stored under and where that node belongs in the hierarchy:

    CREATE TABLE #XML_SOURCE(

    IDINTIDENTITY,

    AssetINT,

    AreaINT,

    ZoneINT,

    RoomINT,

    XML_StructureVarChar(MAX),

    ValueVarChar(MAX)

    );

    INSERT INTO #XML_SOURCE(Asset, Area, Zone, Room, XML_Structure, Value)

    VALUES(100,7,2,NULL,'Asset-Details>Built-Type', '12'),

    (101,8,3,NULL,'Asset_Details>Built-Type', '9'),

    (100,7,2,NULL,'Asset-Details>Extended', '0'),

    (101,8,3,NULL,'Asset-Details>Extended', '3'),

    (101,8,3,NULL,'Asset-Details>WHHM>Roof-Type', '55'),

    (100,7,2,2,'Asset-Details>Area-Type>Length', '100'),

    (100,7,2,1,'Asset-Details>Area-Type>Length', '150'),

    (101,8,3,1,'Asset-Details>Roof>Type', '32'),

    (101,8,3,1,'Asset-Details>Roof>Hieght', '33'),

    (101,8,3,2,'Asset-Details>Roof>Type', '34'),

    (101,8,3,2,'Asset-Details>Scheme', NULL);

    select * from #XML_SOURCE;

    Which should result in the following xml:

    <Assets>

    <Asset ID="100">

    <SYS>

    <Asset-Area>

    <AreaCode>7</AreaCode>

    <Zone>2</Zone>

    </Asset-Area>

    <Asset-Details>

    <Built-Type>12</Built-Type>

    <Extended>0</Extended>

    <Property-Parts>

    <Property-Part>

    <PartNo>1</PartNo>

    <Area-Type>

    <Length>150</Length>

    </Area-Type>

    </Property-Part>

    <Property-Part>

    <PartNo>2</PartNo>

    <Area-Type>

    <Length>100</Length>

    </Area-Type>

    </Property-Part>

    </Property-Parts>

    </Asset-Details>

    <Asset-Data />

    </SYS>

    </Asset>

    <Asset ID="101">

    <SYS>

    <Asset-Area>

    <AreaCode>8</AreaCode>

    <Zone>3</Zone>

    </Asset-Area>

    <Asset-Details>

    <Built-Type>9</Built-Type>

    <Extended>3</Extended>

    <WHHM>

    <Roof-Type>55</Roof-Type>

    </WHHM>

    <Property-Parts>

    <Property-Part>

    <PartNo>1</PartNo>

    <Roof>

    <Type>32</Type>

    <Height>33</Height>

    </Roof>

    </Property-Part>

    <Property-Part>

    <PartNo>2</PartNo>

    <Roof>

    <Type>34</Type>

    </Roof>

    <Scheme />

    </Property-Part>

    </Property-Parts>

    </Asset-Details>

    <Asset-Data />

    </SYS>

    </Asset>

    </Assets>

    The column 'XML_Structure' shows each level, separated by '>'. Now, looking at this I can see what node the value needs to be under and where that belongs in relation to other nodes. However, I'm struggling to translate this in something SQL can handle. And of course, some levels are implied although there aren't values directly associated with them. An example here is the 'WHHM' in one of the entries.

    I've spent quite a bit of time writing code to get this data which avoids cursors (I hadn't appreciated just how useful Pivot and UnPivot can be in that area), as the speed of this procedure is important. It has the potential to be fed an inordinate amount of data, and I really don't want any cursor-type code to step in right at the end and drag it down.

    Hopefully this simplified version of the requirements can provide somebody out there with enough information to show me a way to solve this. I won't bore you (or embarrass myself) with the various pieces of code that I have tried thus far.

    Thanks for your time.

  • Not an easy one.. I have come up with this rather crude way of creating the xml that you require from that data. This isn't great and it isn't going to win any prizes and I'm sure that it can be written a lot neater but I'm short on time at the moment.

    ;

    WITH XMLSource AS

    (SELECT Asset, Area, Zone FROM #XML_SOURCE

    GROUP BY Asset, Area, Zone),

    innerXml AS

    (

    SELECT Asset, Area, Zone,

    (SELECT DISTINCT x2.Room AS 'Property-Part/PartNo'

    , x6.Value AS 'Property-Part/Area-Type/Length'

    , x3.Value AS 'Property-Part/Roof/Type'

    , x4.Value AS 'Property-Part/Roof/Height'

    , x5.Value AS 'Property-Part/Scheme'

    FROM #XML_SOURCE x2

    LEFT JOIN #XML_SOURCE x3 ON x2.Asset = x3.Asset AND x2.Area = x3.Area AND x2.Zone = x3.Zone AND x2.room = x3.Room AND x3.xml_structure = 'Asset-Details>Roof>Type'

    LEFT JOIN #XML_SOURCE x4 ON x2.Asset = x4.Asset AND x2.Area = x4.Area AND x2.Zone = x4.Zone AND x2.room = x4.Room AND x4.xml_structure = 'Asset-Details>Roof>Hieght'

    LEFT JOIN #XML_SOURCE x5 ON x2.Asset = x5.Asset AND x2.Area = x5.Area AND x2.Zone = x5.Zone AND x2.room = x5.Room AND x5.xml_structure = 'Asset-Details>Scheme'

    LEFT JOIN #XML_SOURCE x6 ON x2.Asset = x6.Asset AND x2.Area = x6.Area AND x2.Zone = x6.Zone AND x2.room = x6.Room AND x6.xml_structure = 'Asset-Details>Area-Type>Length'

    WHERE x1.Asset = x2.Asset AND x1.Area = x2.Area AND x1.Zone = x2.Zone AND x2.Room is NOT NULL

    FOR XML PATH(''), TYPE

    ) XMLData

    FROM XMLSource x1

    )

    SELECT x1.Asset AS '@ID'

    , x1.Area AS 'SYS/Asset-Area/AreaCode'

    , x1.Zone AS 'SYS/Asset-Area/Zone'

    , x2.Value AS 'SYS/Asset-Details/Built-Type'

    , x3.Value AS 'SYS/Asset-Details/Extended'

    , x4.Value AS 'SYS/Asset-Details/WHHM/Roof-Type'

    , i.XMLData AS 'SYS/Asset-Details/Property-Parts'

    FROM

    (

    SELECT Asset, Area, Zone FROM #XML_SOURCE

    GROUP BY Asset, Area, Zone

    ) x1

    LEFT JOIN innerXml i ON i.Asset = x1.Asset AND i.Area = x1.Area AND i.Zone = x1.Zone

    LEFT JOIN #XML_SOURCE x2 ON x1.Asset = x2.Asset AND x1.Area = x2.Area AND x1.Zone = x2.Zone AND x2.xml_structure = 'Asset-Details>Built-Type'

    LEFT JOIN #XML_SOURCE x3 ON x1.Asset = x3.Asset AND x1.Area = x3.Area AND x1.Zone = x3.Zone AND x3.xml_structure = 'Asset-Details>Extended'

    LEFT JOIN #XML_SOURCE x4 ON x1.Asset = x4.Asset AND x1.Area = x4.Area AND x1.Zone = x4.Zone AND x4.xml_structure = 'Asset-Details>WHHM>Roof-Type'

    FOR XML PATH('Asset'), ROOT('Assets')

    note that there are a couple of typos in your example xml. The main one was "Asset_Details" compared to "Asset-Details" which you used in the rest of the records.

    If I had a bit more time, I would look at trying to pivot the source data rather than loads of joins into a much wider table and then generating the required XML from that. I might do that over the weekend if i get time.

  • Thanks for taking the time with this problem of mine - I can just about get my head around it! I'll have a further look at it over the weekend.

  • My apologies for taking so long to get back to you.

    I have finally understood the code that you posted and it is far better than anything that I had come up with. The problem that I'll have with it is that the 'XML_Structure' column could contain a large number of unique values, and of course each of these will require a separate join, making the dynamic sql quite lengthy and involved.

    Because it transpires that the XML layout hasn't been finalised and the timescale that I had, I have currently implemented the dreaded cursor. Not my favourite option, but it gives the rest of the development team something that they can plug their interface software into. I now have a couple of weeks where I can work on your suggestion and see how it performs.

    I will confess I had a think about the pivot option, but for this purpose I just couldn't get my head around it.

    My thanks for the time that you spent working on this, it is appreciated.

  • What is the command that makes you go from

    CREATE TABLE #XML_SOURCE(

    ...

    select * from #XML_SOURCE;

    to

    <Assets>

    ...

    </Assets>

    ?

    And what is a good book to learn the syntax used by arthurolcot in his reply?

    For example what does

    ,

    innerXml AS

    mean?

    And AS 'Property-Part/PartNo'

    ?

    Why is x1 used twice: FROM XMLSource x1

    ) x1

    ?

    As an aside, I've tried colouring the bits above to match the colour of the letters in the original in order to make it more readable but if you click on the Font Color icon you get a message: "Enter the hexadecimal font color value to use"; how should one know that?

  • If I read your requirements correctly, you want an XML structure with an <Asset-Details> element that may have many child elements, some of them with child elements of their own. You don't supply any sample data or give any indication about what the underlying table structure looks like, but I'm guess that you are constructing the #XML_SOURCE table from a combination of metadata and values from your database. Assuming your database has the appropriate keys that allow you to associate all the necessary columns of data with the appropriate asset, you don't need the #XML_SOURCE table - you can get what you want by SELECTing directly from your underlying tables and using nested XML.

    Here's an example of how it might work:

    SELECT a.Asset_ID

    ,(SELECT g.AreaCode, g.Zone

    FROM Asset_Location g

    WHERE a.Asset_ID = g.Asset_ID

    FOR XML PATH, ELEMENTS, TYPE) as [Asset-Area]

    ,(SELECT d.Built_Type, d.Scheme

    FROM Asset_Details d

    WHERE a.Asset_ID = d.Asset_ID

    FOR XML PATH, ELEMENTS, TYPE) as [Asset-Details]

    FROM Assets a

    FOR XML PATH ('Asset'), ROOT('Assets'), ELEMENTS, TYPE

    The output would be this:

    <Assets>

    <Asset>

    <Asset_ID>100</Asset_ID>

    <Asset-Area>

    <AreaCode>7</AreaCode>

    <Zone>2</Zone>

    </Asset-Area>

    <Asset-Details>

    <Built_Type>1</Built_Type>

    <Scheme>X12</Scheme>

    </Asset-Details>

    </Asset>

    </Assets>

    The FOR XML clause prompts SQL Server to generate an XML structure using column names as the element names. Without knowing more about your data and database schema, I can't be more specific than that, though. The FOR XML clause will do everything you seem to need (as I understand your needs). Read up on it here: http://msdn.microsoft.com/en-us/library/ms178107.aspx

    Jason Wolfkill

  • arthurolcot (7/6/2012)

    If I had a bit more time, I would look at trying to pivot the source data rather than loads of joins into a much wider table and then generating the required XML from that.

    I managed it at last! Thanks for your help and for a putting the idea of the pivot into my head. Now that I've completed it, it looks far more elegant than anything else I was attempting.

    And as ever, what looked incredibly complex when starting out now looks so much easier at the end.

    Now I just need to document it, so the other developers don't run away screaming from it, if it should ever need updating.

  • That's great! Glad that you was able to achieve what you needed too. Sorry I've not replied before now but Ive been enjoying a nice break in Scotland.

    Pivot'ing is really nice in your scenario but it does take a bit to get your head around it at first. I still find myself reaching for BOL now when I need to pivot!

    Thanks for the feedback.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply