October 5, 2011 at 12:56 pm
We have a ParseXmlToInt function that is used in a couple of stored procedures.
But I found that this function accounts for 99% of the cost in the execution plan of the stored procedure.
How can we optimize this function?
CREATE FUNCTION [dbo].[ParseXmlToInt](@xml XML)
RETURNS TABLE
AS RETURN
(
    SELECT ids.id.value('@id', 'int') AS id
    FROM @xml.nodes('//r') ids(id)
)
GO
Thanks
October 6, 2011 at 5:01 am
By specifying '//r' for the XPath, you are telling SQL Server it needs to traverse the entire document in search of elements called 'r', anywhere in the document.
Do you know anything about the format of the documents you're passing in? For example, you may know that the document always contains 'r' elements as the root element(s) (i.e. the parameter is an XML fragment without a real root node, probably taken from a bigger document), something like <r id="1"/><r id="2"/>. Then your function would only need to read these root elements, so the XQuery search will be much faster. This would be the query in the function (note the XPath):
select ids.id.value('@id', 'int') as id
from @xml.nodes('/r') ids(id)
If the parameters' format has a root element called 'root', the query would be:
select ids.id.value('@id', 'int') as id
from @xml.nodes('/root/r') ids(id)
If there are many elements called 'root', but you only want the r-elements from the first root, the query would be:
select ids.id.value('@id', 'int') as id
from @xml.nodes('/root[1]/r') ids(id)
The point is: think carefully about your XPath. Be as specific as you can, so that SQL Server can work out which parts of the document it needs to read and which parts it can skip. Performance of XPath queries is all about scans and seeks, just like in 'normal' T-SQL.
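To make that concrete, here is a sketch of the original function rewritten with a specific path, assuming the documents are always shaped like `<root><r id="..."/>...</root>` (adjust the path and element names to match your real documents):

```sql
-- Sketch: the function from the first post, with a specific XPath.
-- Assumes the input is always <root><r id="..."/>...</root>.
CREATE FUNCTION [dbo].[ParseXmlToInt](@xml XML)
RETURNS TABLE
AS RETURN
(
    -- '/root/r' lets SQL Server step straight to the r elements,
    -- instead of scanning the whole document as '//r' does
    SELECT ids.id.value('@id', 'int') AS id
    FROM @xml.nodes('/root/r') ids(id)
)
GO
```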
October 6, 2011 at 9:14 am
Thank you.
The XML we have is in the format below, 60,000 records in total.
SET @EmpIDs='<ROOT>
<r id="5551189" />
<r id="5578554" />
<r id="5587894" />
<r id="5602958" />
<r id="5631882" />
<r id="5643028" />
<r id="5643069" />
<r id="5648894" />
<r id="5648969" />
<r id="5650411" />
......
<r id="8015231" />
</ROOT>'
And we currently use:
select ids.id.value('@id', 'int') as id from @xml.nodes('//r') ids(id)
Is that correct?
It looks like we should use the query below from your previous post; will this improve performance?
select ids.id.value('@id', 'int') as id
from @xml.nodes('/root/r') ids(id)
Thanks
October 6, 2011 at 9:26 am
sqlfriends (10/6/2011)
It looks like we should use below from your previous post, will this improve performance
select ids.id.value('@id', 'int') as id
from @xml.nodes('/root/r') ids(id)
XML is case sensitive, so that should be
select ids.id.value('@id', 'int') as id
from @xml.nodes('/ROOT/r') ids(id)
in order to match your sample data.
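Putting the pieces together, the full function for this sample data might look like the sketch below (it assumes every input document is wrapped in `<ROOT>...</ROOT>`, as in the sample):

```sql
-- Sketch: the function updated with the case-correct, specific path.
-- Assumes every input document is wrapped in <ROOT>...</ROOT>.
ALTER FUNCTION [dbo].[ParseXmlToInt](@xml XML)
RETURNS TABLE
AS RETURN
(
    SELECT ids.id.value('@id', 'int') AS id
    FROM @xml.nodes('/ROOT/r') ids(id)  -- XML names are case sensitive
)
GO
```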
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
October 6, 2011 at 10:07 am
I've created an XML document as per your sample above, and used copy and paste to extend it to 80,000 entries. On my machine, your function parses it and returns an 80,000-row table in 881ms.
I would say that's quite reasonable performance; what sort of performance would you find acceptable?
October 6, 2011 at 10:14 am
..and half that time is just writing the output to the screen.
Setting "discard results after execution" results in an execution time of 420ms.
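For anyone who wants to reproduce this kind of test, here is a sketch of how it might look: it generates a large document in the same shape as the sample data (the element names match this thread but are otherwise an assumption) and times just the parse.

```sql
-- Sketch of a timing test like the one described above.
-- Builds an 80,000-entry <ROOT><r id="..."/>...</ROOT> document,
-- then times dbo.ParseXmlToInt against it.
DECLARE @xml XML;

WITH n AS (
    -- Numbers 1..80000 via a cross join of system views
    SELECT TOP (80000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
    FROM sys.all_objects a CROSS JOIN sys.all_objects b
)
SELECT @xml = (SELECT i AS [@id] FROM n FOR XML PATH('r'), ROOT('ROOT'), TYPE);

SET STATISTICS TIME ON;
SELECT COUNT(*) FROM dbo.ParseXmlToInt(@xml);  -- COUNT(*) avoids timing screen output
SET STATISTICS TIME OFF;
```

Using COUNT(*) instead of returning all the rows has a similar effect to "discard results after execution": it keeps the time spent writing rows to the client out of the measurement.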
October 7, 2011 at 9:16 am
paul_ramster (10/6/2011)
I've created an XML document as per your sample above, and used copy and paste to extend it to 80,000 entries. On my machine, your function parses it and returns a 80,000 row table in 881ms.I would say that's quite reasonable performance, what sort of performance would you find acceptable?
Thanks,
We have a stored procedure that uses the function. Running the procedure takes 40 seconds in total, and in its execution plan I can see that 99% of the cost is on the function reading the XML into a table.
That's why I think this function needs to be improved for performance.
October 7, 2011 at 9:23 am
That means that the optimizer estimated the cost of the function as 99% of the total cost of the query, which is not quite the same thing as it actually being 99% of the cost in practice.
I think my previous test suggests that the optimizer is not getting the cost estimate correct for your query. It's quite likely that something else in your query is taking the time. Could be some outdated stats, or skewed distribution in the data or something, but we'd need more details to see if that is true.
A copy of the execution plan XML for starters.
October 7, 2011 at 9:33 am
I ran the sproc again, and have included the actual execution plan in the attachment.
Thanks
October 9, 2011 at 10:28 pm
Thanks for the attempt, but that is not the execution plan; that is a picture of part of the execution plan. Please have a look at this article http://www.sqlservercentral.com/articles/SQLServerCentral/66909/ on how to get and post your query's execution plan, and try again.
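One way to capture the full plan as XML (a sketch; the procedure name here is hypothetical) is to wrap the call in SET STATISTICS XML, which returns the actual execution plan as an XML result that can be saved and attached as a .sqlplan file:

```sql
-- Sketch: capture the actual execution plan as XML.
SET STATISTICS XML ON;
EXEC dbo.YourStoredProcedure;  -- hypothetical name; substitute the real sproc
SET STATISTICS XML OFF;
```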