RE: Query XML Value – SQLServerCentral

SSC Guru

Points: 61136

January 23, 2014 at 11:47 am

bryan 83518 (1/15/2014)
Trying to work out how to query xml that looks like this I will loop through all the artist but only want the Main Artist not the Secondary
<Artists>
<ArtistText>Genie Zhuo</ArtistText>
<Artist Type="Main">
<FullName>Genie Zhuo</FullName>
</Artist>
<Artist Type="Secondary">
<FullName>The Second Man</FullName>
</Artist>
</Artists>
<Artists>
<ArtistText>The Man</ArtistText>
<Artist Type="Main">
<FullName>The Man</FullName>
</Artist>
<Artist Type="Secondary">
<FullName>The Second Man</FullName>
</Artist>
</Artists>
This query works but if I change the [1] to 2 or 4 I get the secondary artist
SELECT
XMLContents.value('(/Product/Artists/Artist/FullName)[1]', 'nvarchar(max)') as RecordType
FROM [XMLFile]
where XMLFileId = 1

Richard showed how you could pull the data from a SQL variable. If you had your data in a table you could query the XML like this:

SELECT a.value('(FullName)[1]', 'nvarchar(max)') as RecordType

FROM #sampleXML sx

CROSS APPLY x.nodes('Artists/Artist[@Type="Main"]') y(a)

A couple performance notes:

1st if you are trying to return text (e.g. <xxx>this is text</xxx>) then you want to specify a text node (which I will demonstrate in a moment.)

2nd, it does not appear that you would need nvarchar(max). (max) data types can be killers, for this you should consider something smaller (in my example I used nvarchar(1000).

Here's a test harness:

IF OBJECT_ID('tempdb..#sampleXML') IS NOT NULL DROP TABLE #sampleXML;

CREATE TABLE #sampleXML (xid int primary key, x xml not null)

DECLARE @x varchar(4000)='

<Artists>

<ArtistText>#5</ArtistText>

<Artist Type="Main">

<FullName>Artist #1</FullName>

</Artist>

<Artist Type="Secondary">

<FullName>Artist #2</FullName>

</Artist>

</Artists>

<Artists>

<ArtistText>#6</ArtistText>

<Artist Type="Main">

<FullName>Artist #3</FullName>

</Artist>

<Artist Type="Secondary">

<FullName>Artist #4</FullName>

</Artist>

</Artists>';

DECLARE @rows int=1000;

WITH tally AS

(

SELECT TOP (@rows)

ROW_NUMBER() OVER (ORDER BY (SELECT ($))) AS n

FROM sys.all_columns a CROSS JOIN sys.all_columns b

)

INSERT #sampleXML

SELECT n,

CONVERT(xml,

REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(@x,

'#1','#'+CONVERT(varchar(1000),floor(900000*rand(convert(varbinary, newid())))+100000)),

'#2','#'+CONVERT(varchar(1000),floor(900000*rand(convert(varbinary, newid())))+100000)),

'#3','#'+CONVERT(varchar(1000),floor(900000*rand(convert(varbinary, newid())))+100000)),

'#4','#'+CONVERT(varchar(1000),floor(900000*rand(convert(varbinary, newid())))+100000)),

'#5','random text:'+CONVERT(varchar(1000),floor(900000*rand(convert(varbinary, newid())))+100000)),

'#6','random text:'+CONVERT(varchar(1000),floor(900000*rand(convert(varbinary, newid())))+100000)))

FROM tally

Here's three versions of the same query using the performance improvements I mentioned along with the results:

-- not using text() node

SELECT a.value('(FullName)[1]', 'nvarchar(max)') as RecordType

FROM #sampleXML sx

CROSS APPLY x.nodes('Artists/Artist[@Type="Main"]') y(a)

-- using text() node

SELECT a.value('(FullName/text())[1]', 'nvarchar(max)') as RecordType

FROM #sampleXML sx

CROSS APPLY x.nodes('Artists/Artist[@Type="Main"]') y(a)

-- changing nvarchar(max) to nvarchar(1000)

SELECT a.value('(FullName/text())[1]', 'nvarchar(1000)') as RecordType

FROM #sampleXML sx

CROSS APPLY x.nodes('Artists/Artist[@Type="Main"]') y(a)

Results:

1,000 rows:

-- without text() node, nvarchar(max):

SQL Server Execution Times:

CPU time = 359 ms, elapsed time = 758 ms.

-- with text() node, nvarchar(max):

SQL Server Execution Times:

CPU time = 374 ms, elapsed time = 601 ms.

--change nvarchar(max) to nvarchar(1000):

SQL Server Execution Times:

CPU time = 203 ms, elapsed time = 532 ms.

10,000 rows:

-- without text() node, nvarchar(max):

SQL Server Execution Times:

CPU time = 3260 ms, elapsed time = 7420 ms.

-- with text() node, nvarchar(max):

SQL Server Execution Times:

CPU time = 2403 ms, elapsed time = 5113 ms.

-- with text() node, nvarchar(1000):

SQL Server Execution Times:

CPU time = 1419 ms, elapsed time = 3209 ms.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001