XML Shredding

  • Hi, whats the easiest way to shred this type of attribute centric XML document in such a way that I can query for the "error" type and pull out the "File not found" value?

    <server name="Server1">

    <log>

    <logItem type="LogDate">

    <value string="2012/12/15" />

    </logItem>

    <logItem type="error">

    <value string="File not found" />

    </logItem>

    <logItem type="source">

    <value string="Dir/SubDir/SomeFile" />

    </logItem>

    </log>

    </server>

    Thanks

  • You should be able to get it out using xml data type methods.

    http://msdn.microsoft.com/en-us/library/ms190798.aspx

  • As Ray mentions, the xml methods are perfect for this.. here is an example based on your xml

    DECLARE @xml XML

    SET @xml = '<server name="Server1">

    <log>

    <logItem type="LogDate">

    <value string="2012/12/15" />

    </logItem>

    <logItem type="error">

    <value string="File not found" />

    </logItem>

    <logItem type="source">

    <value string="Dir/SubDir/SomeFile" />

    </logItem>

    </log>

    </server>'

    SELECT @xml.value('(/server/log/logItem[@type="error"]/value/@string)[1]', 'varchar(50)')

  • Here's a simple version:

    DECLARE @XML XML = '

    <server name="Server1">

    <log>

    <logItem type="LogDate">

    <value string="2012/12/15" />

    </logItem>

    <logItem type="error">

    <value string="File not found" />

    </logItem>

    <logItem type="source">

    <value string="Dir/SubDir/SomeFile" />

    </logItem>

    </log>

    </server>';

    SELECT LogItem.value('(/logItem/value/@string)[1]', 'varchar(max)') AS ErrorString

    FROM (SELECT L.Item.query('.') AS LogItem

    FROM @XML.nodes('/server/log/logItem') AS L (Item)) AS Sub

    WHERE LogItem.exist('/logItem[(@type cast as xs:string?) eq xs:string("error")]') = 1;

    The first part just takes your XML and assigns to a variable. If it's a single value, that will work. If it's a column in a table, try something more like this:

    IF OBJECT_ID(N'tempdb..#T') IS NOT NULL

    DROP TABLE #T;

    CREATE TABLE #T

    (ID INT IDENTITY

    PRIMARY KEY,

    LogXML XML NOT NULL);

    INSERT INTO #T

    (LogXML)

    VALUES ('

    <server name="Server1">

    <log>

    <logItem type="LogDate">

    <value string="2012/12/11" />

    </logItem>

    <logItem type="error">

    <value string="File not found" />

    </logItem>

    <logItem type="source">

    <value string="Dir/SubDir/SomeFile" />

    </logItem>

    </log>

    </server>'),

    ('

    <server name="Server1">

    <log>

    <logItem type="LogDate">

    <value string="2012/12/10" />

    </logItem>

    <logItem type="error">

    <value string="Another error" />

    </logItem>

    <logItem type="source">

    <value string="Dir/SubDir/SomeFile" />

    </logItem>

    </log>

    </server>')

    SELECT ID, LogItem.value('(/logItem/value/@string)[1]', 'varchar(max)') AS ErrorString

    FROM (SELECT ID,

    L.Item.query('.') AS LogItem

    FROM #T AS T

    CROSS APPLY LogXML.nodes('/server/log/logItem') AS L (Item)) AS Sub

    WHERE LogItem.exist('/logItem[(@type cast as xs:string?) eq xs:string("error")]') = 1;

    You'll use your real table, instead of #T, of course. And you won't drop the table, obviously. But the rest should translate fairly easily.

    If you want to look at the documentation for the parts of this, and how the nodes() and value() functions work, the data is here: http://msdn.microsoft.com/en-us/library/ms189075(v=sql.105)

    Does that help?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Thanks all.

    GSquared, that's perfect. It is a field in a table storing XML, not a single XML doc that I'm working with. That probably would have been my next question, so great steer there.

    Regards

    Nick

  • You're welcome. Glad we could help.

    I just saw arthurolcot's post.

    If the log entries are always singular, that will work. I used the nodes() function in my solution in case a single XML entry could possibly contain more than one log entry.

    It won't work if the data ever looks like:

    DECLARE @xml XML

    SET @xml = '<server name="Server1">

    <log>

    <logItem type="LogDate">

    <value string="2012/12/15" />

    </logItem>

    <logItem type="error">

    <value string="File not found" />

    </logItem>

    <logItem type="source">

    <value string="Dir/SubDir/SomeFile" />

    </logItem>

    </log>

    <log>

    <logItem type="LogDate">

    <value string="2012/12/15" />

    </logItem>

    <logItem type="error">

    <value string="Another error" />

    </logItem>

    <logItem type="source">

    <value string="Dir/SubDir/SomeFile" />

    </logItem>

    </log>

    </server>'

    SELECT @xml.value('(/server/log/logItem[@type="error"]/value/@string)[1]', 'varchar(50)')

    The version I posted will handle that:

    IF OBJECT_ID(N'tempdb..#T') IS NOT NULL

    DROP TABLE #T;

    CREATE TABLE #T

    (ID INT IDENTITY

    PRIMARY KEY,

    LogXML XML NOT NULL);

    INSERT INTO #T

    (LogXML)

    VALUES ('

    <server name="Server1">

    <log>

    <logItem type="LogDate">

    <value string="2012/12/11" />

    </logItem>

    <logItem type="error">

    <value string="File not found" />

    </logItem>

    <logItem type="source">

    <value string="Dir/SubDir/SomeFile" />

    </logItem>

    </log>

    </server>'),

    ('

    <server name="Server1">

    <log>

    <logItem type="LogDate">

    <value string="2012/12/10" />

    </logItem>

    <logItem type="error">

    <value string="Another error" />

    </logItem>

    <logItem type="source">

    <value string="Dir/SubDir/SomeFile" />

    </logItem>

    </log>

    <log>

    <logItem type="LogDate">

    <value string="2012/12/10" />

    </logItem>

    <logItem type="error">

    <value string="Yet another error" />

    </logItem>

    <logItem type="source">

    <value string="Dir/SubDir/SomeFile" />

    </logItem>

    </log>

    </server>')

    SELECT ID, LogItem.value('(/logItem/value/@string)[1]', 'varchar(max)') AS ErrorString

    FROM (SELECT ID,

    L.Item.query('.') AS LogItem

    FROM #T AS T

    CROSS APPLY LogXML.nodes('/server/log/logItem') AS L (Item)) AS Sub

    WHERE LogItem.exist('/logItem[(@type cast as xs:string?) eq xs:string("error")]') = 1;

    I'm not sure what your data looks like (of course), but I usually try to err on the side of assuming XML will have more than one node. After all, that's part of what it's for.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply