SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


SHREDDING XML - Can it be done quickly?


SHREDDING XML - Can it be done quickly?

Author
Message
sgmunson
sgmunson
SSC-Forever
SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)

Group: General Forum Members
Points: 43700 Visits: 5422
I have the following combination of code, for which the declared xml variable isn't something I can influence. However, looking into the stored procedure it's a parameter to, I'm finding a SELECT that when you take out all the other joins and selected columns, what's left is the SELECT in the code below. Running this separately in SSMS took 21 seconds, for one lousy row. Execution plan uses a table valued function XML reader for each column. There has GOT to be a better (and hopefully, significantly faster) way to do this. As I'm not real sure even what to go Googling for, I could use a little help in the shredding xml quickly department. Here's the code:
DECLARE @p2 AS xml =
CONVERT(xml,
N'<ROOT><R ID="24591988" PrimID="1003381" AuthID="1157388" ProgID="25242" ServiceCodeID="647" WorkerID="31522" Date="2017-12-01T00:00:00" StartTime="" EndTime="" RequestID="0" GPermID="0" PayRateOverride="" AutoScheduled="false"/></ROOT>'
);

SELECT t.col.value('@ID','BIGINT') AS ID,
t.col.value('@PrimID','INT') AS PrimID,
t.col.value('@AuthID','INT') AS AuthID,
t.col.value('@ProgID','INT') AS ProgID,
t.col.value('@ServiceCodeID','INT') AS ServiceCodeID,
t.col.value('@WorkerID','INT') AS WorkerID,
t.col.value('@Date','DATETIME') AS [Date],
t.col.value('@StartTime','DATETIME') AS StartTime,
t.col.value('@EndTime','DATETIME') AS EndTime,
t.col.value('@RequestID','INT') AS RequestID,
t.col.value('@GPermID','INT') AS GPermID,
t.col.value('@PayRateOverride','VARCHAR(30)') AS PayRateOverride,
t.col.value('@AutoScheduled','BIT') AS AutoScheduled
FROM @p2.nodes('/ROOT/R') AS t(col)



Steve
‌(aka sgmunson)
Smile Smile Smile
Health & Nutrition
Luis Cazares
Luis Cazares
SSC Guru
SSC Guru (96K reputation)SSC Guru (96K reputation)SSC Guru (96K reputation)SSC Guru (96K reputation)SSC Guru (96K reputation)SSC Guru (96K reputation)SSC Guru (96K reputation)SSC Guru (96K reputation)

Group: General Forum Members
Points: 96490 Visits: 21217
I'm not sure what's the problem as I get those results immediately in my laptop. A different option that we use here is using OPENXML which requires sp_xml_preparedocument and sp_xml_removedocument.


DECLARE @iDoc int;
--Create an internal representation of the XML document.
EXEC sp_xml_preparedocument @iDoc OUTPUT, @p2;

SELECT *
FROM OPENXML (@iDoc, '/ROOT/R', 1)
WITH (
ID BIGINT,
PrimID INT,
AuthID INT,
ProgID INT,
ServiceCodeID INT,
WorkerID INT,
Date DATETIME,
StartTime DATETIME,
EndTime DATETIME,
RequestID INT,
GPermID INT,
PayRateOverride VARCHAR(30),
AutoScheduled BIT);

--Release XML document
EXEC sp_xml_removedocument @iDoc;



Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?


How to post data/code on a forum to get the best help: Option 1 / Option 2
Alan Burstein
Alan Burstein
SSC-Dedicated
SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)SSC-Dedicated (32K reputation)

Group: General Forum Members
Points: 32822 Visits: 8582
This is running much slower for you than it should. Your query is instantaneous on my laptop as well. I even created 10K rows of sample data and tested it. This query takes a couple seconds on my laptop:

declare @table table (someXML xml);
insert @table
select top (10000) replace(cast(@p2 as varchar(1000)),'1003381', user_type_id)
from sys.all_columns;

SELECT t.col.value('@ID','BIGINT') AS ID,
t.col.value('@PrimID','INT') AS PrimID,
t.col.value('@AuthID','INT') AS AuthID,
t.col.value('@ProgID','INT') AS ProgID,
t.col.value('@ServiceCodeID','INT') AS ServiceCodeID,
t.col.value('@WorkerID','INT') AS WorkerID,
t.col.value('@Date','DATETIME') AS [Date],
t.col.value('@StartTime','DATETIME') AS StartTime,
t.col.value('@EndTime','DATETIME') AS EndTime,
t.col.value('@RequestID','INT') AS RequestID,
t.col.value('@GPermID','INT') AS GPermID,
t.col.value('@PayRateOverride','VARCHAR(30)') AS PayRateOverride,
t.col.value('@AutoScheduled','BIT') AS AutoScheduled
FROM @table x
cross apply x.someXML.nodes('/ROOT/R') AS t(col);


T-SQL XML queries against XML are going to be slower than ones against tables no matter what but not 21 seconds a row. As Luis mentioned sp_xml_preparedocument, I too, have had some success with that.




-- Alan Burstein


Helpful links:

Best practices for getting help on SQLServerCentral -- Jeff Moden
How to Post Performance Problems -- Gail Shaw

Nasty fast set-based string manipulation functions:
For splitting strings try DelimitedSplit8K or DelimitedSplit8K_LEAD (SQL Server 2012+)
To split strings based on patterns try PatternSplitCM
Need to clean or transform a string? try NGrams, PatExclude8K, PatReplace8K, DigitsOnlyEE, or Translate8K

I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code. -- Itzik Ben-Gan 2001

sgmunson
sgmunson
SSC-Forever
SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)SSC-Forever (43K reputation)

Group: General Forum Members
Points: 43700 Visits: 5422
Thanks Luis. As it turns out, it had to be either the server, or my copy of SSMS. One or the other was mucked up. I later ended up crashing SSMS, and it's been working fine in about a quarter of a second (EDIT: for 1,000 executions) ever since. I tried using Jeff's DelimitedSplit8K at first, but the performance just wasn't there, and probably because I used the same split twice after splitting once to begin with. I hate testing on shared servers when performance testing is involved. You never know what you're up against...

I ended up just using that original boiled down XML shred going into a temp table. Turns out the doc prep execs cost an extra 0.175 seconds on 1,000 executions, so I'll have to not use that on this occasion. Thanks again for creating a sounding board.

Steve
‌(aka sgmunson)
Smile Smile Smile
Health & Nutrition
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum








































































































































































SQLServerCentral


Search