Basics of XML and SQL Server, Part 2: Shredding XML

Question

Post reply

Basics of XML and SQL Server, Part 2: Shredding XML

Stan Kulp-439977

SSChampion

Points: 10062
More actions
March 13, 2012 at 11:47 pm

#252449

Comments posted to this topic are about the item Basics of XML and SQL Server, Part 2: Shredding XML

Viewing 15 posts - 1 through 15 (of 27 total)

You must be logged in to reply to this topic. Login to reply

ggareth SSCommitted Points: 1589 More actions · Answer 1

Nice article. Please consider the surface well and truly scratched!

adam.everett SSC Veteran Points: 265 More actions · Answer 2

I dont understand the point about the doc not being loaded into ram for xpath

taken from your xpath sample,

DECLARE @Doc XML

SET @Doc = (SELECT * FROM OPENROWSET(BULK 'C:\xml\StockMarketDataAsElements.xml',SINGLE_BLOB) AS x)

has this not loaded the doc into ram?

R.P.Rozema SSChampion Points: 12307 More actions · Answer 3

Nice sum up of the various possibilities indeed.

A remark: though I understand you had to keep each section brief because you wanted to cover all topics, I think you could have emphasized a little more on the use of xquery over the openrowset method, as xquery is mostly far superior in performance and memory use over the openrowset method. I found that most articles on the sql & xml subject tend to first list the 'outdated' methods and only after that get to the newer, easier methods. This results in many people giving up reading before they have actually gotten to the part where the easier methods are described. Your's is no exception, sadly.

Then, why is the section on converting attributes into elements included in your article? I've got 2 reasons for this question:

1 - the solution shown does not use SQL server to solve the problem, nor can it be automated, so it is often not usable for your target audience, sql server admins/developers, and

2 - SQL server processes data from attributes a lot faster than it processes elements. Even though I do agree that many xml documents use attributes for the wrong purposes (but that is for different reasons), if the document already has its data in attributes, then why not take advantage of this fact and have SQL server process the data from the attributes directly instead of doing all the pre-processing to convert them into elements and then have a bigger document that processes slower?

And a tip: You don't need an external tool to 'tidy' xml data: formatting xml data can easily be done from SSMS: While inside an opened xml document window, the 'Edit' menu contains the 'advanced' -> 'Format document (Ctrl-K, Ctrl-D)' option. This formats the entire xml document, also an option to format just the selected text is available.

A second tip: to put xml output by SSMS into a file many people copy from the Results tab and paste the xml text into some other editor, to save it from there. It is far more easy to right-click the column header (output into grid must be on for this, but that is required for almost anything when working with xml) and select 'Save Results As...'. In the file dialog that opens, first change 'Save as type' from the default 'CSV (Comma delimited)' into 'All files', then enter a file name (including the .xml extension) and press Save. This works well for much larger result files than can be copied using copy-paste.

Posting Data Etiquette - Jeff Moden[/url]
Posting Performance Based Questions - Gail Shaw[/url]
Hidden RBAR - Jeff Moden[/url]
Cross Tabs and Pivots - Jeff Moden[/url]
Catch-all queries - Gail Shaw[/url]

If you don't have time to do it right, when will you have time to do it over?

R.P.Rozema SSChampion Points: 12307 More actions · Answer 4

Yet another tip: the use of OPENROWSET requires permissions many admins will not allow end-users to have on production machines as it opens up many security holes. There is however a way for the admin to setup sql server such that any user can only request the server to open a document the user can read himself, using his own credentials. Using kerberos constrained delegation any user can make the server read any data that he/she can read him/herself. This even works for having the server read the documents from a network share!. Using kerberos contrained delegation the server will never allow to open a document the user does not have access to already via 'normal' file access methods.

This article describes how to set up kerberos constrained delegation on SQL server 2008.

Posting Data Etiquette - Jeff Moden[/url]
Posting Performance Based Questions - Gail Shaw[/url]
Hidden RBAR - Jeff Moden[/url]
Cross Tabs and Pivots - Jeff Moden[/url]
Catch-all queries - Gail Shaw[/url]

If you don't have time to do it right, when will you have time to do it over?

R.P.Rozema SSChampion Points: 12307 More actions · Answer 5

Looking forward for your article on XML DML! I've tried many things, but so far I failed to find any useful patterns to use the .modify() method.

My attempts so far with .modify() have been unfruitfull in applying it for anything else than trivial example code. So if you can include some examples on how to use it for example:

- to change the n-th element's value where n is not known before hand (i.e. avoid putting some constant like [1] in the xpath), or

- to change the value of an element by a value, depending on the current value in that element (again, dynamically, not by specifying something like [@e = "value1"] in the xpath.), or

- to change a value in an element in a location in a document that is is not statically known. For example when in the document the elements can be nested to any depth to form a tree and I need to modify only nodes in this tree that match specific criteria, or

- to apply .modify() repeatably on a single document to update a set of element's values in the document.

Posting Data Etiquette - Jeff Moden[/url]
Posting Performance Based Questions - Gail Shaw[/url]
Hidden RBAR - Jeff Moden[/url]
Cross Tabs and Pivots - Jeff Moden[/url]
Catch-all queries - Gail Shaw[/url]

If you don't have time to do it right, when will you have time to do it over?

aschaffer Old Hand Points: 354 More actions · Answer 6

Sorry if I missed it, but a request that when offering a part X of Y series of articles, to please include links to the previous articles in the series...just makes it easier to find. I didn't see part 1 and would like to start there before reading part 2. Thanks.

R.P.Rozema SSChampion Points: 12307 More actions · Answer 7

aschaffer (3/14/2012)
Sorry if I missed it, but a request that when offering a part X of Y series of articles, to please include links to the previous articles in the series...just makes it easier to find. I didn't see part 1 and would like to start there before reading part 2. Thanks.

Here you are : Part 1[/url]

Posting Data Etiquette - Jeff Moden[/url]
Posting Performance Based Questions - Gail Shaw[/url]
Hidden RBAR - Jeff Moden[/url]
Cross Tabs and Pivots - Jeff Moden[/url]
Catch-all queries - Gail Shaw[/url]

If you don't have time to do it right, when will you have time to do it over?

dalexmailbox Old Hand Points: 378 More actions · Answer 8

- to change the n-th element's value where n is not known before hand (i.e. avoid putting some constant like [1] in the xpath),

you should use sql:variable function like this:

declare @xml xml='<root><element>1</element><element>2</element></root>'

declare @n int=2

SET @xml.modify('replace value of (//root/element[position()=sql:variable("@n")]/text())[1] with "12"')

select @xml

Stan Kulp-439977 SSChampion Points: 10062 More actions · Answer 9

"1 - the solution shown does not use SQL server to solve the problem, nor can it be automated, so it is often not usable for your target audience, sql server admins/developers."

In Part 3 I show how to automate it.

R.P.Rozema SSChampion Points: 12307 More actions · Answer 10

Excellent! Looking forward for it. And thanks a lot for the series.

Posting Data Etiquette - Jeff Moden[/url]
Posting Performance Based Questions - Gail Shaw[/url]
Hidden RBAR - Jeff Moden[/url]
Cross Tabs and Pivots - Jeff Moden[/url]
Catch-all queries - Gail Shaw[/url]

If you don't have time to do it right, when will you have time to do it over?

Charles Daringer SSC Rookie Points: 45 More actions · Answer 11

I have a question, if you loaded the sql initially via stream not from the file system and wanted to perform a XSLT2.0 transformation within SQL Server from a procedure and then grab the result of that procedure via a stream could you do this?

Can the XSLT transformation be accomplished within SQL Server?

Stan Kulp-439977 SSChampion Points: 10062 More actions · Answer 12

The next article in this series will show you how to write an SSIS package that performs XSLT transformations.

It will be published next Wednesday.

Stan Kulp-439977 SSChampion Points: 10062 More actions · Answer 13

The first five articles in this series were accepted before any of them were published. I don't know the URL of a published article until it is actually published. Once an article is accepted, I can make no further edits, so I can't add the URL to the previous article.

What you can do is click on the author's name under the title of an article to see all the articles written by that author.

http://www.sqlservercentral.com/Authors/Articles/Stan_Kulp/439977/

aschaffer Old Hand Points: 354 More actions · Answer 14

Perhaps then the publishing entity could add such links when publishing. I've seen them before on other articles from SSC.