Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

XML SHREDDING Expand / Collapse
Author
Message
Posted Tuesday, October 16, 2012 3:30 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, June 26, 2014 3:16 PM
Points: 6, Visits: 135
DECLARE @docHandle int
DECLARE @xmlDocument nvarchar(1000)
SET @xmlDocument =
N'
<hearingData>
<author>
<date>2012-09-12T12:41:22</date>
<companyName>TK Group</companyName>
<software>Audiogram System</software>
<softwareVersion>6.0.30</softwareVersion>
<fileVersion>1.02</fileVersion>
</author>
<employees>
<employee>
<id>000000000</id>
<name>
<last>Fake</last>
<first>Name</first>
<middle>A</middle>
</name>
<status>active</status>
<employer>
<id>001</id>
<billingCode>0001</billingCode>
<siteCode/>
</employer>
<gender>F</gender>
<birthDate>1950-10-10</birthDate>
<hireDate>1999-10-31</hireDate>
<department>Pastry Prod</department>
<job>PRODUCTION HELPER</job>
<shift>1</shift>
<clockId>11111</clockId>
<ssn>000000000</ssn>
<testingInterval>12</testingInterval>
<protection>unspecified</protection>
<hearingTest>
<testDate>2012-08-23T14:26:53</testDate>
<tester>
<name>
<last>K</last>
<first>Sandeep</first>
</name>
<certificationId>54698</certificationId>
</tester>
<audiometer>
<serialNumber>920554</serialNumber>
<dateCalibrated>2011-04-08</dateCalibrated>
<make>Tremetrics</make>
<model>RA 600</model>
</audiometer>
<reason>annual</reason>
<lapse>less_than_14_hours</lapse>
<protectionUsed>true</protectionUsed>
<thresholds>
<left500>15</left500>
<left1K>25</left1K>
<left2K>25</left2K>
<left3K>25</left3K>
<left4K>25</left4K>
<left6K>25</left6K>
<left8K>45</left8K>
<right500>15</right500>
<right1K>15</right1K>
<right2K>15</right2K>
<right3K>25</right3K>
<right4K>25</right4K>
<right6K>20</right6K>
<right8K>40</right8K>
</thresholds>
<otoscopics>
<performed>NA</performed>
<blockage>NA</blockage>
<drainage>NA</drainage>
<other>NA</other>
</otoscopics>
<medicalQuestions>
<earPain>none</earPain>
<draining>none</draining>
<dizziness>false</dizziness>
<ringing>none</ringing>
<suddenLoss>none</suddenLoss>
<fluctuatingLoss>none</fluctuatingLoss>
<fullness>none</fullness>
<protectionProblem>none</protectionProblem>
<drugs>false</drugs>
<highBloodPressure>false</highBloodPressure>
<seenMd>none</seenMd>
<surgery>none</surgery>
<unconsciousness>false</unconsciousness>
<hearingAid>none</hearingAid>
<mumps>false</mumps>
<scarletFever>false</scarletFever>
<measles>false</measles>
<meningitis>false</meningitis>
<diabetes>false</diabetes>
<kidneyDisease>false</kidneyDisease>
<allergies>false</allergies>
<familyLoss>false</familyLoss>
<highNoise>false</highNoise>
<noProtection>false</noProtection>
<headCold>false</headCold>
<military>false</military>
<noisyHobbies>false</noisyHobbies>
<loudMusic>false</loudMusic>
<guns>false</guns>
</medicalQuestions>
<hearingConclusion>
<leftStsConclusion>no_change</leftStsConclusion>
<rightStsConclusion>no_change</rightStsConclusion>
<leftRecordableConclusion>no_change</leftRecordableConclusion>
<rightRecordableConclusion>no_change</rightRecordableConclusion>
<medicalConclusion>
<conclusion>normal</conclusion>
</medicalConclusion>
</hearingConclusion>
</hearingTest>
</employee>
</employees>
</hearingData>'

--Create an internal representation of the XML document.
EXEC sp_xml_preparedocument @docHandle OUTPUT, @xmlDocument
-- Execute a SELECT statement using OPENXML rowset provider.
SELECT *
FROM OPENXML (@docHandle, N'/hearingdata/employees/employee',2)
WITH (employeeID nvarchar(50) '/hearingdata/employees/employee/id',
EmployeeLName nvarchar(50) '/hearingdata/employees/employee/id/name/last',
EmployheareeFName nvarchar(50) '/hearingdata/employees/employee/id/name/first',
EmployeeMName nvarchar(50) '/hearingdata/employees/employee/id/name/middle',
EmployeeStatus nvarchar(10) '/hearingdata/employees/employee/status',
EmployerID nvarchar(20) '/hearingdata/employees/employee/employer/id',
EmployerBillingCode nvarchar(20) '/hearingdata/employees/employee/employer/billingcode',
EmployerSiteCode nvarchar(20) '/hearingdata/employees/employee/employer/sitecode'
)
EXEC sp_xml_removedocument @docHandle
I wanted to shred above file using TSQL .I tried SSIS. It returned a bunch of errors:

The XML parse error 0xc00ce55e occurred on line number 32, near the XML text " <c".
Msg 6602, Level 16, State 2, Procedure sp_xml_preparedocument, Line 1
The error description is 'Element was not closed.'.
Msg 8179, Level 16, State 5, Line 127
Could not find prepared statement with handle 0.
Msg 6607, Level 16, State 3, Procedure sp_xml_removedocument, Line 1
sp_xml_removedocument: The value supplied for parameter number 1 is invalid.

How to Solve this?Thanks for your time and help.

Post #1373549
Posted Tuesday, October 16, 2012 3:35 PM
UDP Broadcaster

UDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP Broadcaster

Group: General Forum Members
Last Login: Friday, September 19, 2014 2:29 PM
Points: 1,483, Visits: 1,031
From what you posted you have @xmlDocument declared as nvarchar(1000) but your string is 2997 chars long, so its getting truncated.

after increasing the length. It runs. but returns no Rows
Post #1373551
Posted Tuesday, October 16, 2012 3:39 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, June 26, 2014 3:16 PM
Points: 6, Visits: 135
@RAY M

Thanks for a quick reply. I modified it to varchar(max) .It gave me a table output but doesn't have data in it!
Post #1373555
Posted Tuesday, October 16, 2012 3:45 PM
UDP Broadcaster

UDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP Broadcaster

Group: General Forum Members
Last Login: Friday, September 19, 2014 2:29 PM
Points: 1,483, Visits: 1,031
yeah, you need to fenagle with the path, I'll check it out reall quick.

Updated sql select
SELECT *
FROM OPENXML (@docHandle, N'/hearingData/employees/employee',2)
WITH (employeeID nvarchar(50) 'id'
,EmployeeLName nvarchar(50) 'name/last'
,EmployheareeFName nvarchar(50) 'name/first'
,EmployeeMName nvarchar(50) 'name/middle'
,EmployeeStatus nvarchar(10) 'status'
,EmployerID nvarchar(20) 'employer/id'
,EmployerBillingCode nvarchar(20) 'employer/billingCode'
,EmployerSiteCode nvarchar(20) 'employer/siteCode'
)
the xml nodes are case sensitive, billingCode, and siteCode were not exactly right.
Post #1373557
Posted Tuesday, October 16, 2012 3:59 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, June 26, 2014 3:16 PM
Points: 6, Visits: 135
That works exactly.Thank you so much. I am on this issue for around 4 hours now. Next time I will come directly to the forums. :D
Post #1373560
Posted Tuesday, October 16, 2012 4:03 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 11:00 AM
Points: 35,276, Visits: 31,768
Hmmm.... that reminds me. I hate trying to science out all the paths necessary for a full return of an XML document. I made a stored procedure that will not only do it for me but prints the XML shredding code. I'll see if I can find it when I get home.

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1373562
Posted Tuesday, October 16, 2012 4:44 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, June 26, 2014 3:16 PM
Points: 6, Visits: 135
Thanks Jeff. The automated process will be of great help ,in case you find it.
Post #1373577
Posted Tuesday, October 16, 2012 10:41 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 11:00 AM
Points: 35,276, Visits: 31,768
I found it. I had actually started to write an article about it to help others like me that know squat about XML and really don't like using it (actually, I have a deep rooted personal hatred for it at many levels), but I got Shanghied by other priorities.

To be sure, I don't claim that the code will work on all XML simply because I don't know enough about XML to say so.

Details are in the comments. The code is ready to run as is.

/*
References:
Look for Stan Kulp articles on the subject.
http://stackoverflow.com/questions/61233/the-best-way-to-shred-xml-data-into-sql-server-database-columns
http://architectshack.com/ClrXmlShredder.ashx

http://blogs.msdn.com/b/simonince/archive/2009/04/24/flattening-xml-data-in-sql-server.aspx


MIXED XML
http://labs.adobe.com/technologies/spry/samples/data_region/NestedXMLDataSample.html

*/

-------------------------------------------------------------------------------
--=============================================================================
-- Build some test data for this code review. This is NOT a part of the
-- solution. The @pXML variable could be a parameter in a stored proc.
--=============================================================================
--===== This will contain the actual XML document to be shredded and flattened.
DECLARE @pXML XML;

--===== This is an "Attribute Based" example.
-- "FirstName1" and "Zip2" attributes and the "Phone" entity for
-- "PhoneNumber6" have intentionally been left out to demonstrate that the
-- XML DOESN'T have to be "perfect" meaning that not every "entity" needs
-- to have all the attributes. It still needs to be "well formed", though.
SET @pXML = '
<AccountDetailsRsp AccountNum="1" AccountStatus="AccountStatus1">
<PlayerInfo PlayerID="1" LastName="LastName1">
<AddressList>
<PlayerAddress>
<Address AddressType="primary" City="City1" State="State1" Zip="Zip1"/>
<FutureUse />
<Phone PhoneNumber="PhoneNumber1" PhoneType="Type1" />
<Phone PhoneNumber="PhoneNumber2" PhoneType="Type2" />
</PlayerAddress>
<PlayerAddress>
<Address AddressType="billing" State="State2" City="City2" />
<FutureUse />
<Phone PhoneNumber="PhoneNumber3" PhoneType="Type3" />
</PlayerAddress>
</AddressList>
</PlayerInfo>
<PlayerInfo PlayerID="2" FirstName="FirstName2" LastName="LastName2">
<AddressList>
<PlayerAddress>
<Address AddressType="primary" City="City3" State="State3" Zip="Zip3"/>
<FutureUse />
<Phone PhoneNumber="PhoneNumber4" PhoneType="Type4" />
<Phone PhoneNumber="PhoneNumber5" PhoneType="Type5" />
</PlayerAddress>
<PlayerAddress>
<Address AddressType="billing" Zip="Zip4" State="State4" City="City4" />
<FutureUse />
</PlayerAddress>
</AddressList>
</PlayerInfo>
</AccountDetailsRsp>'
;
--===== This is an "Element Based" example. Uncomment it to see the code work
-- with it instead of the "Attribute Based" example.
-- "Skills" have been left out for "Simon" and "Age" has been left out
-- for "Sally" to demonstrate that the XML DOESN'T have to be perfect
-- meaning that not every "entity" needs to have all the attributes.
-- It still needs to be "well formed", though.
--SET @pXML = '
--<Root>
-- <Person>
-- <Name>Peter</Name>
-- <Age>21</Age>
-- <Skills>
-- <Skill>Cooking</Skill>
-- <Skill>Carpentry</Skill>
-- <Skill>Hunting</Skill>
-- </Skills>
-- </Person>
-- <Person>
-- <Name>Sally</Name>
-- <Age />
-- <Skills>
-- <Skill>Cooking</Skill>
-- <Skill>Carpentry</Skill>
-- </Skills>
-- </Person>
--</Root>
--'
--;
--===== This is an "Hybrid Based" example. Uncomment it to see the code work
-- with it instead of the other examples.
SET @pXML = '
<items>
<item id="0001" type="Donut">
<name>Cake</name>
<ppu>0.55</ppu>
<batter id="1001">Regular</batter>
<batter id="1002">Chocolate</batter>
<batter id="1003">Blueberry</batter>
<topping id="5001">None</topping>
<topping id="5002">Glazed</topping>
<topping id="5005">Sugar</topping>
<topping id="5006">Sprinkles</topping>
<topping id="5003">Chocolate</topping>
<topping id="5004">Maple</topping>
</item>
</items>
'
;

--------------------------------------------------------------------------------
;
/******************************************************************************
The following code is what I built to automatically flatten non-perfect but
still "well formed" "data" XML. It should handle "Attribute Based", "Element
Baseed", and "Hybrid" (combination of the two) XML. It not only automatically
creates a "flatted" result set from the XML, but it also creates and displays
the T-SQL that did the job so that you can easily tweak the code to meet exact
needs (such as changing the data-types or pointing it at a table, etc.

This does what I thought the whole purpose of what XML was actually was
supposed to be... the ability to make a flat result set without having to
manually "discover" and then write code to actually figure out what the XML
contains.

As a side bar, this method seems much more forgiving than using an XSD file to
use the data from the XML. Certainly, it seems easier than using an XSD file.
I could be wrong, though. This is really my first significant dive into XML.

Developer's Notes:
1. This code is "forgiving" in that not every "entity" within the XML needs to
have all the attributes that might be available throughout the XML. Unlike
other code of this nature, this code scans ALL of the XML for all entities
and all attributes. It even allows for duplicate names to occur in the
attributes (columns).
2. Despite Note 1 above, the XML falls well short of "artificial intelligence"
and the XML must still be "well formed". Tags must still have the
appropriate start and end tag where required, for example. Attributes
in "Attribute Based" XML must still be properly nested in the "Entity" tag
and the attribute value assignments must still be properly formed.
3. I'm from the U.S.A. and I have no current requirements to use NVARCHAR.
To wit, if you need the ability to use UniCode in your XML, then you'll
need to modify this code to change all VARCHARs to NVARCHAR (including any
string constants and variables), CHAR(10) to NCHAR(10, and the number
"8000" to "4000" (there's only one place where that happens).
4. Although I believe in the occasional need for case-sensitive columns, I
don't believe in server default case-sensitive collations. As a result, be
aware that I made no attempt to make this code case-INsensitive (although I
believe that the cases match in this code because of the great job someone
at Microsoft did on sp_xml_preparedocument).
5. With a bit of forethought, the code could be modified to automatically
generate individual appropriately named and keyed tables based on the name
of each "Entity" contained in the XML. I didn't take the time to do that
for two reasons... 1) Most people are just happy to flatten the XML and
get the flat result set just by issuing one command and 2) this exercise
has already turned my brain into mud because I actually had to work with
XML and I hate XML. I have to wash my mouth out every time I say it.

Rev 00 - 14 Jul 2012 - Jeff Moden - Initial creation and unit test.
******************************************************************************/
--=============================================================================
-- Presets
--=============================================================================
--===== Suppress the auto-display of rowcounts for appearances in this case.
SET NOCOUNT ON;

--===== Conditionally drop the working tables to make reruns in SSMS easier.
IF OBJECT_ID('tempdb..#Edge' ,'U') IS NOT NULL DROP TABLE #Edge;
IF OBJECT_ID('tempdb..#NodeInfo','U') IS NOT NULL DROP TABLE #NodeInfo;

--=============================================================================
-- Create a hierarchical table from the XML document which contains ALL of
-- the meta-data and data about the XML that we need.
--=============================================================================
--===== Create an obviously named variable to refer to the "prepared" XML
-- document.
DECLARE @DocHandle INT;

--===== Microsoft says that the following code "Creates an internal
-- representation of the document". What they don't tell you is that it's
-- not much more than a simple "Adjacency List" hierarchy table with a
-- couple of special columns added. Microsoft refers to this hierarchical
-- table as an "Edge" table. An "edge" is the line draw between two nodes
-- on something like an Org Chart.
EXEC sp_xml_preparedocument @DocHandle OUTPUT, @pXML;

--===== Copy the data from the implicit "edge" table of the XML into a Temp
-- Table where we can work on it without having to call OPENXML more
-- than once.
--
-- Note that the max width of "column names" (LocalName, etc) is expected
-- to be no more that 256 characters. You could set these to be much
-- larger but I'd also expect that means that someone got silly with the
-- XML itself.
SELECT ID = ISNULL(ID,0), --ISNULL makes a NOT NULL column so we can PK it.
ParentID,
NodeType, --1="Entity/Table", 2="Attribute/Column", 3="Data Node"
--4= a new node type to handle "Element Based" data.
ParentLocalName = CAST(NULL AS VARCHAR(256)),
LocalName = CAST(LocalName AS VARCHAR(256)),
Text,
HLevel = CAST(NULL AS INT),
Width = CAST(NULL AS VARCHAR(10))
INTO #Edge
FROM OPENXML (@DocHandle, '/') --This means "Prepare EVERYTHING" in the XML
;
--===== Remove the XML document to save on resources.
EXEC sp_xml_removedocument @DocHandle;

--===== This prevents multiple rescans similar to what you would find in a
-- "Triangular Join" which would make the UPDATE that follows this very
-- slow. Since it's a Temp Table, we'll let the machine create a unique
-- name for the constraint because constraints must be uniquely named and
-- we'll want to use this code concurrently in the future.
ALTER TABLE #Edge
ADD PRIMARY KEY CLUSTERED (ID)
;
--=============================================================================
-- Add extra information to the "edge" table to more easily create other
-- simpler information later on.
--=============================================================================
--===== Build the hierarchical level column, get the name of the "parent" for
-- each row, and calculate the max width of each "column" of data. Most of
-- this is necessary to control the order in which we build the T-SQL to
-- successfully and fully shred the XML.
WITH
cteBuildPath AS
(--==== This is the "anchor" part of the recursive CTE
SELECT anchor.ID,
HLevel = 1,
LocalName,
ParentLocalName
FROM #Edge AS anchor
WHERE ParentID IS NULL
UNION ALL -------------------------------------------------------------------
--==== This is the "recursive" part of the CTE that adds 1 for each level
-- and carries the previous LocalName forward as the ParentLocalName for
-- the next level of rows.
SELECT recur.ID,
HLevel = cte.HLevel+1,
LocalName = recur.LocalName,
ParentLocalName = cte.LocalName
FROM #Edge AS recur
INNER JOIN cteBuildPath AS cte
ON cte.ID = recur.ParentID
)--==== This does the actual update of the "edge" table from above.
-- It also calculates the column width of the data so we can modify
-- the resulting T-SQL later to make it so not every column has to
-- be a TEXT or NVARCHAR(MAX) column.
UPDATE tgt
SET tgt.HLevel = bp.HLevel,
tgt.ParentLocalName = bp.ParentLocalName,
tgt.Width = DATALENGTH(tgt.Text)
FROM #Edge tgt
INNER JOIN cteBuildPath bp
ON tgt.ID = bp.ID
;
--=============================================================================
-- Create the "node information table" which will ultimately be used to
-- build the T-SQL to read and return the XML data as a "flat" result set.
--=============================================================================
--===== This isolates "Entity" and "Attribute" node types and provides the
-- necessary table aliases. It also keeps us from having to do any more
-- large scans on the temporary edge table, which could actually be
-- dropped at this time to conserve resources.
--
-- Note that the ISNULL gets rid of the Null aggregate WARNING message
-- and speeds things up a bit.
SELECT RowNum = IDENTITY(INT,0,1), --Just a "uniquifier" for a PK.
NodeType, HLevel, ParentLocalName, LocalName,
DeDupeLocalName = CAST(NULL AS VARCHAR(256)),
Alias = CASE --These are "derived table" aliases we need in the T-SQL.
WHEN NodeType = 1
THEN 'x' + CAST(ROW_NUMBER() OVER
(ORDER BY NodeType,HLevel) AS VARCHAR(10))
ELSE ''
END,
Width = MAX(ISNULL(Width,0))
INTO #NodeInfo
FROM #Edge
GROUP BY NodeType, HLevel, ParentLocalName, LocalName
;
--===== Add some indexes just in case there are a lot of "columns" to process.
-- A lower fill factor is used because of the upcoming data mods in the
-- rest of the code to reduce costly page and extent splits when there
-- are a lot of "columns". If you regularly deal with such large XML,
-- you might want to reduce the FILLFACTOR even more. Still, even if you
-- don't change it at all, this is going to be a whole lot faster than
-- trying to write code to flatten the XML on your own.
ALTER TABLE #NodeInfo
ADD PRIMARY KEY CLUSTERED (RowNum) --Let the system name it on Temp Tables.
WITH FILLFACTOR = 80
;
CREATE UNIQUE INDEX IX_#NodeInfo_NodeType_RowNum
ON #NodeInfo (NodeType, RowNum)
WITH FILLFACTOR = 80
;
--=============================================================================
-- Build the FROM clause of the T-SQL. We need to do this first because
-- it works the same way for both "Attribute Based" and "Element Based"
-- XML. We'll actually need to modify the node data later on to build the
-- SELECT clause because the two different types of XML have different
-- requirements when it comes to node types. "Attribute Based" XML has
-- "Type 2 (column) nodes behind the scenes where "Element Based" XML
-- does not. "Hybrid Based" will originally look like "Attribute Based".
--=============================================================================
--===== This creates the FROM and OUTER APPLYs using the essential aliases for
-- each "entity" (Type 1 node) in the XML.
-- This works as is for both attribute based and non-attibute based XML.
DECLARE @From VARCHAR(MAX);
SELECT @From = '';

SELECT @From = @From
+ ISNULL(' OUTER APPLY ' + ParentLocalName, ' FROM @pXML')
+ '.nodes (' + QUOTENAME(LocalName,'''') + ') '
+ Alias + ' '
+ '(' + LocalName + ')' + CHAR(10)
FROM #NodeInfo
WHERE NodeType = 1
;
--=============================================================================
-- If the XML is "Element Based" rather than "Attribute Based", there are
-- no "Type 2 (column) nodes. In order to make the SELECT clause for both
-- in a common fashion, we need to modify the node information table so
-- that it looks like the data is "Attribute Based".
--=============================================================================
--===== We need to add aliases to any Type 2 Nodes that don't already have them
-- so we can correctly build the SELECT list.
UPDATE tgt
SET tgt.Alias = src.Alias
FROM #NodeInfo tgt
INNER JOIN #NodeInfo src
ON tgt.ParentLocalName = src.LocalName
WHERE src.NodeType = 1 --Entity Node
AND tgt.NodeType = 2 --Column Node
;
--===== If the XML isn't attribute based, there will be no Type 2 nodes.
-- Make the node info table look like there are except make them a new
-- (for this code) "Type 4" (Element Based Column). This is done really
-- easilysimply by seeing if the "LocalName" is present as a Type 3 data
-- node according to the "ParentLocalName". This is probably the only
-- place where duplicate Entity or Attribute names might cause a problem.
UPDATE #NodeInfo
SET NodeType = 4
WHERE NodeType = 1
AND LocalName IN (SELECT ParentLocalName FROM #NodeInfo WHERE NodeType = 3)
;
--===== Now we need to resolve any duplicate column names so that we're not
-- confused by the output if dupes exist.
UPDATE #NodeInfo
SET DeDupeLocalName = ParentLocalName + '_' + LocalName
WHERE LocalName IN
(
SELECT LocalName
FROM #NodeInfo
WHERE LocalName <> '#text' --Only data nodes have this LocalName
GROUP BY LocalName
HAVING COUNT(*) > 1
)
;
--=============================================================================
-- Now that the node information looks like it's "Attribute Based", copy
-- the "column width" information from the Type 3 (data) nodes to the
-- Type 2 (column) and Type 4 (element column) nodes to make the final
-- dynamic SQL for the SELECT clause a whole lot easier to build and
-- understand.
--
-- Notice that we can also easily handle the MAX datatype here because
-- we stored the WIDTH as a VARCHAR earlier.
--=============================================================================
--===== Copy the column Width from the data nodes to the column nodes
UPDATE tgt
SET tgt.Width = CASE
WHEN src.Width <= 8000
THEN src.Width
ELSE 'MAX'
END
FROM #NodeInfo tgt
INNER JOIN #NodeInfo src
ON tgt.LocalName = src.ParentLocalName
WHERE tgt.NodeType IN (2,4)
AND src.NodeType = 3
;
--=============================================================================
-- That's it. Whether the XML is "Attribute Based", "Element Based" or
-- a "Hybrid" of the two, we now have enough information to build the
-- SELECT list for both types of XML with some minor tweaks by CASE
-- statements as to which data to use.
--
-- Although ORDER isn't really important here, it will give you better
-- hints as to what's actually in the XML and the order that it's in for
-- troubleshooting purposes.
--=============================================================================
--===== This creates the SELECT list
DECLARE @Select VARCHAR(MAX);

SELECT @Select = ISNULL(@Select + ',' + CHAR(10)+ SPACE(8), '')
+ ISNULL(DeDupeLocalName,LocalName) + ' = '
+ CASE
WHEN NodeType = 2 --Node Type 4 implied for the ELSE by WHERE
THEN Alias + '.' + ParentLocalName + '.value '
ELSE Alias + '.' + LocalName + '.value '
END
+ CASE
WHEN NodeType = 2 --Node Type 4 implied for the ELSE by WHERE
THEN '(' + QUOTENAME('@' + LocalName ,'''')
+ ', ''VARCHAR(' + Width + ')'')'
ELSE '(''(text())[1]'', ''VARCHAR(' + Width + ')'')'
END
FROM #NodeInfo
WHERE NodeType IN (2,4)
ORDER BY HLevel, Alias --Simplifies troubleshooting if needed
;
--===== Put the final SQL all together along with some helpful hints for use.
DECLARE @SQL NVARCHAR(MAX);
SELECT @SQL = '
/*=============================================================================
The following T-SQL is what shredded the XML you gave it.

If desired, please modify the code to change the datatypes to suit your needs
and maybe add an INTO #SomeTableName just before the FROM clause to
"auto-magically build and store the results into a temporary or other table as
a "staging" table on-the-fly to work from.
--Jeff Moden
=============================================================================*/
'
+ ' SELECT RowNum = ROW_NUMBER() ' --Ensure unique rows
+ 'OVER(ORDER BY (SELECT NULL)),' + CHAR(10) + SPACE(8)
+ @Select + CHAR(10)
+ @From
+ ';' + '
--=============================================================================
-- Hint on how to convert the T-SQL to work on a table column.
--=============================================================================
/*
Change the FROM clause in the SQL above in a similar manner to below to shred
an entire column of similar (ie: same nodes/structure) XML.

Change from: (Note that "... etc x" means "the rest of the line of existing code.")

FROM @pXML.nodes (... etc 1
OUTER APPLY ... etc 2
OUTER APPLY ... etc 3
... etc x

Change to:

FROM dbo.YourTableName x0
CROSS APPLY YourXMLColumnName.nodes( ... etc 1
OUTER APPLY ... etc 2
OUTER APPLY ... etc 3
... etc x
*/
'
;
--===== Display the SQL in the Messages tab so we can reuse it or modify it
-- to become production code later on.
PRINT @SQL;
SELECT [NOTICE!!!] = 'Please see "Messages" tab for the rendered T-SQL'
UNION ALL
SELECT [NOTICE!!!] = 'that created the "table" below from the XML.'
;
-- SELECT * FROM #Edge
-- SELECT * FROM #NodeInfo

--===== Show that the SQL works as advertised and so that we can examine the
-- results to tweak the generated SQL for datatypes, etc, later.
EXECUTE sp_executesql @SQL,
N'@pXML XML',
@pXML
;

--select * from #Edge



--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1373618
Posted Wednesday, October 17, 2012 8:47 AM
UDP Broadcaster

UDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP Broadcaster

Group: General Forum Members
Last Login: Friday, September 19, 2014 2:29 PM
Points: 1,483, Visits: 1,031
Well Done Mr. Moden
Post #1373859
Posted Wednesday, October 17, 2012 8:52 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, June 26, 2014 3:16 PM
Points: 6, Visits: 135
@Thanks Jeff, will try this
Post #1373866
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse