RE: XQuery error – SQLServerCentral

SSC Guru

Points: 61136

April 25, 2017 at 8:07 am

#1939758

Mikael Eriksson SE - Monday, April 24, 2017 11:34 PM
Using the parent axis when extracting values from XML is not good for performance.
Use an extra cross apply instead.
select P.X.value('(PackID/text())[1]','int') as PackID, I.X.value('(ItemNumber/text())[1]','int') as ItemNumber, I.X.value('(Qty/text())[1]','int') as Qty from @packXML.nodes('/Pack') as P(X) cross apply P.X.nodes('Item') as I(X);
For more info on the problem with the parent axis you can have a look at this question on StackOverflow:
Cross apply xml query performs exponentially worse as xml document grows
.

Well done Mikael. I would have thought that the extra APPLY would increase the overhead but you are 100% right - the parent axis is a killer. I looked at the link you posted which includes some good info but also did a quick performance test and was blown away at the difference.

IF OBJECT_ID('tempdb..#tmp') IS NOT NULL DROP TABLE #tmp; CREATE TABLE #tmp (packkey int identity primary key, packXML XML NOT NULL);

INSERT #tmp(packXML) SELECT TOP (100000) '<Pack> <PackID>'+ CAST(abs(checksum(newid())%10) AS varchar(2)) +'</PackID> <Item> <ItemNumber>1</ItemNumber> <Qty>'+ CAST(abs(checksum(newid())%20)+1 AS varchar(2)) +'</Qty> </Item> </Pack> <Pack> <PackID>2</PackID> <Item> <ItemNumber>1</ItemNumber> <Qty>'+ CAST(abs(checksum(newid())%20)+1 AS varchar(2)) +'</Qty> </Item> <Item> <ItemNumber>2</ItemNumber> <Qty>'+ CAST(abs(checksum(newid())%90)+1 AS varchar(2)) +'</Qty> </Item> </Pack>' FROM sys.all_columns a, sys.all_columns b;

SET NOCOUNT ON; IF OBJECT_ID('tempdb..#dump1') IS NOT NULL DROP TABLE #dump1; IF OBJECT_ID('tempdb..#dump2') IS NOT NULL DROP TABLE #dump2; GO

SET STATISTICS IO ON; SET STATISTICS TIME ON; SELECT packkey, N.value('(../PackID/text())[1]', 'int') as PackID, N.value('(ItemNumber/text())[1]','int') as ItemNumber, N.value('(Qty/text())[1]','int') as Qty INTO #dump1 FROM #tmp CROSS APPLY packXML.nodes('Pack/Item') as T(N);

SELECT packkey, P.X.value('(PackID/text())[1]','int') as PackID, I.X.value('(ItemNumber/text())[1]','int') as ItemNumber, I.X.value('(Qty/text())[1]','int') as Qty INTO #dump2 FROM #tmp CROSS APPLY packXML.nodes('/Pack') as P(X) CROSS APPLY P.X.nodes('Item') as I(X); SET STATISTICS TIME OFF; SET STATISTICS IO OFF; GO

DROP TABLE #dump1; DROP TABLE #dump2; DROP TABLE #tmp;

Results

Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#tmp_0000000002B3'. Scan count 3, logical reads 2278, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 25422 ms, elapsed time = 13259 ms.
Table '#tmp_0000000002B3'. Scan count 3, logical reads 2278, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 6313 ms, elapsed time = 3205 ms.

I learned something new today before 9AM.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001