XML SHREDDING

  • DECLARE @docHandle int

    DECLARE @xmlDocument nvarchar(1000)

    SET @xmlDocument =

    N'

    <hearingData>

    <author>

    <date>2012-09-12T12:41:22</date>

    <companyName>TK Group</companyName>

    <software>Audiogram System</software>

    <softwareVersion>6.0.30</softwareVersion>

    <fileVersion>1.02</fileVersion>

    </author>

    <employees>

    <employee>

    <id>000000000</id>

    <name>

    <last>Fake</last>

    <first>Name</first>

    <middle>A</middle>

    </name>

    <status>active</status>

    <employer>

    <id>001</id>

    <billingCode>0001</billingCode>

    <siteCode/>

    </employer>

    <gender>F</gender>

    <birthDate>1950-10-10</birthDate>

    <hireDate>1999-10-31</hireDate>

    <department>Pastry Prod</department>

    <job>PRODUCTION HELPER</job>

    <shift>1</shift>

    <clockId>11111</clockId>

    <ssn>000000000</ssn>

    <testingInterval>12</testingInterval>

    <protection>unspecified</protection>

    <hearingTest>

    <testDate>2012-08-23T14:26:53</testDate>

    <tester>

    <name>

    <last>K</last>

    <first>Sandeep</first>

    </name>

    <certificationId>54698</certificationId>

    </tester>

    <audiometer>

    <serialNumber>920554</serialNumber>

    <dateCalibrated>2011-04-08</dateCalibrated>

    <make>Tremetrics</make>

    <model>RA 600</model>

    </audiometer>

    <reason>annual</reason>

    <lapse>less_than_14_hours</lapse>

    <protectionUsed>true</protectionUsed>

    <thresholds>

    <left500>15</left500>

    <left1K>25</left1K>

    <left2K>25</left2K>

    <left3K>25</left3K>

    <left4K>25</left4K>

    <left6K>25</left6K>

    <left8K>45</left8K>

    <right500>15</right500>

    <right1K>15</right1K>

    <right2K>15</right2K>

    <right3K>25</right3K>

    <right4K>25</right4K>

    <right6K>20</right6K>

    <right8K>40</right8K>

    </thresholds>

    <otoscopics>

    <performed>NA</performed>

    <blockage>NA</blockage>

    <drainage>NA</drainage>

    <other>NA</other>

    </otoscopics>

    <medicalQuestions>

    <earPain>none</earPain>

    <draining>none</draining>

    <dizziness>false</dizziness>

    <ringing>none</ringing>

    <suddenLoss>none</suddenLoss>

    <fluctuatingLoss>none</fluctuatingLoss>

    <fullness>none</fullness>

    <protectionProblem>none</protectionProblem>

    <drugs>false</drugs>

    <highBloodPressure>false</highBloodPressure>

    <seenMd>none</seenMd>

    <surgery>none</surgery>

    <unconsciousness>false</unconsciousness>

    <hearingAid>none</hearingAid>

    <mumps>false</mumps>

    <scarletFever>false</scarletFever>

    <measles>false</measles>

    <meningitis>false</meningitis>

    <diabetes>false</diabetes>

    <kidneyDisease>false</kidneyDisease>

    <allergies>false</allergies>

    <familyLoss>false</familyLoss>

    <highNoise>false</highNoise>

    <noProtection>false</noProtection>

    <headCold>false</headCold>

    <military>false</military>

    <noisyHobbies>false</noisyHobbies>

    <loudMusic>false</loudMusic>

    <guns>false</guns>

    </medicalQuestions>

    <hearingConclusion>

    <leftStsConclusion>no_change</leftStsConclusion>

    <rightStsConclusion>no_change</rightStsConclusion>

    <leftRecordableConclusion>no_change</leftRecordableConclusion>

    <rightRecordableConclusion>no_change</rightRecordableConclusion>

    <medicalConclusion>

    <conclusion>normal</conclusion>

    </medicalConclusion>

    </hearingConclusion>

    </hearingTest>

    </employee>

    </employees>

    </hearingData>'

    --Create an internal representation of the XML document.

    EXEC sp_xml_preparedocument @docHandle OUTPUT, @xmlDocument

    -- Execute a SELECT statement using OPENXML rowset provider.

    SELECT *

    FROM OPENXML (@docHandle, N'/hearingdata/employees/employee',2)

    WITH (employeeID nvarchar(50) '/hearingdata/employees/employee/id',

    EmployeeLName nvarchar(50) '/hearingdata/employees/employee/id/name/last',

    EmployheareeFName nvarchar(50) '/hearingdata/employees/employee/id/name/first',

    EmployeeMName nvarchar(50) '/hearingdata/employees/employee/id/name/middle',

    EmployeeStatus nvarchar(10) '/hearingdata/employees/employee/status',

    EmployerID nvarchar(20) '/hearingdata/employees/employee/employer/id',

    EmployerBillingCode nvarchar(20) '/hearingdata/employees/employee/employer/billingcode',

    EmployerSiteCode nvarchar(20) '/hearingdata/employees/employee/employer/sitecode'

    )

    EXEC sp_xml_removedocument @docHandle

    I wanted to shred above file using TSQL .I tried SSIS. It returned a bunch of errors:

    The XML parse error 0xc00ce55e occurred on line number 32, near the XML text " <c".

    Msg 6602, Level 16, State 2, Procedure sp_xml_preparedocument, Line 1

    The error description is 'Element was not closed.'.

    Msg 8179, Level 16, State 5, Line 127

    Could not find prepared statement with handle 0.

    Msg 6607, Level 16, State 3, Procedure sp_xml_removedocument, Line 1

    sp_xml_removedocument: The value supplied for parameter number 1 is invalid.

    How to Solve this?Thanks for your time and help.

  • From what you posted you have @xmlDocument declared as nvarchar(1000) but your string is 2997 chars long, so its getting truncated.

    after increasing the length. It runs. but returns no Rows

  • @ray M

    Thanks for a quick reply. I modified it to varchar(max) .It gave me a table output but doesn't have data in it!

  • yeah, you need to fenagle with the path, I'll check it out reall quick.

    Updated sql select

    SELECT *

    FROM OPENXML (@docHandle, N'/hearingData/employees/employee',2)

    WITH (employeeID nvarchar(50) 'id'

    ,EmployeeLName nvarchar(50) 'name/last'

    ,EmployheareeFName nvarchar(50) 'name/first'

    ,EmployeeMName nvarchar(50) 'name/middle'

    ,EmployeeStatus nvarchar(10) 'status'

    ,EmployerID nvarchar(20) 'employer/id'

    ,EmployerBillingCode nvarchar(20) 'employer/billingCode'

    ,EmployerSiteCode nvarchar(20) 'employer/siteCode'

    )

    the xml nodes are case sensitive, billingCode, and siteCode were not exactly right. 😉

  • That works exactly.Thank you so much. I am on this issue for around 4 hours now. Next time I will come directly to the forums. 😀

  • Hmmm.... that reminds me. I hate trying to science out all the paths necessary for a full return of an XML document. I made a stored procedure that will not only do it for me but prints the XML shredding code. I'll see if I can find it when I get home.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks Jeff. The automated process will be of great help ,in case you find it. 🙂

  • I found it. I had actually started to write an article about it to help others like me that know squat about XML and really don't like using it (actually, I have a deep rooted personal hatred for it at many levels), but I got Shanghied by other priorities.

    To be sure, I don't claim that the code will work on all XML simply because I don't know enough about XML to say so. :blush:

    Details are in the comments. The code is ready to run as is.

    /*

    References:

    Look for Stan Kulp articles on the subject.

    http://stackoverflow.com/questions/61233/the-best-way-to-shred-xml-data-into-sql-server-database-columns

    http://architectshack.com/ClrXmlShredder.ashx

    http://blogs.msdn.com/b/simonince/archive/2009/04/24/flattening-xml-data-in-sql-server.aspx

    MIXED XML

    http://labs.adobe.com/technologies/spry/samples/data_region/NestedXMLDataSample.html

    */

    -------------------------------------------------------------------------------

    --=============================================================================

    -- Build some test data for this code review. This is NOT a part of the

    -- solution. The @pXML variable could be a parameter in a stored proc.

    --=============================================================================

    --===== This will contain the actual XML document to be shredded and flattened.

    DECLARE @pXML XML;

    --===== This is an "Attribute Based" example.

    -- "FirstName1" and "Zip2" attributes and the "Phone" entity for

    -- "PhoneNumber6" have intentionally been left out to demonstrate that the

    -- XML DOESN'T have to be "perfect" meaning that not every "entity" needs

    -- to have all the attributes. It still needs to be "well formed", though.

    SET @pXML = '

    <AccountDetailsRsp AccountNum="1" AccountStatus="AccountStatus1">

    <PlayerInfo PlayerID="1" LastName="LastName1">

    <AddressList>

    <PlayerAddress>

    <Address AddressType="primary" City="City1" State="State1" Zip="Zip1"/>

    <FutureUse />

    <Phone PhoneNumber="PhoneNumber1" PhoneType="Type1" />

    <Phone PhoneNumber="PhoneNumber2" PhoneType="Type2" />

    </PlayerAddress>

    <PlayerAddress>

    <Address AddressType="billing" State="State2" City="City2" />

    <FutureUse />

    <Phone PhoneNumber="PhoneNumber3" PhoneType="Type3" />

    </PlayerAddress>

    </AddressList>

    </PlayerInfo>

    <PlayerInfo PlayerID="2" FirstName="FirstName2" LastName="LastName2">

    <AddressList>

    <PlayerAddress>

    <Address AddressType="primary" City="City3" State="State3" Zip="Zip3"/>

    <FutureUse />

    <Phone PhoneNumber="PhoneNumber4" PhoneType="Type4" />

    <Phone PhoneNumber="PhoneNumber5" PhoneType="Type5" />

    </PlayerAddress>

    <PlayerAddress>

    <Address AddressType="billing" Zip="Zip4" State="State4" City="City4" />

    <FutureUse />

    </PlayerAddress>

    </AddressList>

    </PlayerInfo>

    </AccountDetailsRsp>'

    ;

    --===== This is an "Element Based" example. Uncomment it to see the code work

    -- with it instead of the "Attribute Based" example.

    -- "Skills" have been left out for "Simon" and "Age" has been left out

    -- for "Sally" to demonstrate that the XML DOESN'T have to be perfect

    -- meaning that not every "entity" needs to have all the attributes.

    -- It still needs to be "well formed", though.

    --SET @pXML = '

    --<Root>

    -- <Person>

    -- <Name>Peter</Name>

    -- <Age>21</Age>

    -- <Skills>

    -- <Skill>Cooking</Skill>

    -- <Skill>Carpentry</Skill>

    -- <Skill>Hunting</Skill>

    -- </Skills>

    -- </Person>

    -- <Person>

    -- <Name>Sally</Name>

    -- <Age />

    -- <Skills>

    -- <Skill>Cooking</Skill>

    -- <Skill>Carpentry</Skill>

    -- </Skills>

    -- </Person>

    --</Root>

    --'

    --;

    --===== This is an "Hybrid Based" example. Uncomment it to see the code work

    -- with it instead of the other examples.

    SET @pXML = '

    <items>

    <item id="0001" type="Donut">

    <name>Cake</name>

    <ppu>0.55</ppu>

    <batter id="1001">Regular</batter>

    <batter id="1002">Chocolate</batter>

    <batter id="1003">Blueberry</batter>

    <topping id="5001">None</topping>

    <topping id="5002">Glazed</topping>

    <topping id="5005">Sugar</topping>

    <topping id="5006">Sprinkles</topping>

    <topping id="5003">Chocolate</topping>

    <topping id="5004">Maple</topping>

    </item>

    </items>

    '

    ;

    --------------------------------------------------------------------------------

    ;

    /******************************************************************************

    The following code is what I built to automatically flatten non-perfect but

    still "well formed" "data" XML. It should handle "Attribute Based", "Element

    Baseed", and "Hybrid" (combination of the two) XML. It not only automatically

    creates a "flatted" result set from the XML, but it also creates and displays

    the T-SQL that did the job so that you can easily tweak the code to meet exact

    needs (such as changing the data-types or pointing it at a table, etc.

    This does what I thought the whole purpose of what XML was actually was

    supposed to be... the ability to make a flat result set without having to

    manually "discover" and then write code to actually figure out what the XML

    contains.

    As a side bar, this method seems much more forgiving than using an XSD file to

    use the data from the XML. Certainly, it seems easier than using an XSD file.

    I could be wrong, though. This is really my first significant dive into XML.

    Developer's Notes:

    1. This code is "forgiving" in that not every "entity" within the XML needs to

    have all the attributes that might be available throughout the XML. Unlike

    other code of this nature, this code scans ALL of the XML for all entities

    and all attributes. It even allows for duplicate names to occur in the

    attributes (columns).

    2. Despite Note 1 above, the XML falls well short of "artificial intelligence"

    and the XML must still be "well formed". Tags must still have the

    appropriate start and end tag where required, for example. Attributes

    in "Attribute Based" XML must still be properly nested in the "Entity" tag

    and the attribute value assignments must still be properly formed.

    3. I'm from the U.S.A. and I have no current requirements to use NVARCHAR.

    To wit, if you need the ability to use UniCode in your XML, then you'll

    need to modify this code to change all VARCHARs to NVARCHAR (including any

    string constants and variables), CHAR(10) to NCHAR(10, and the number

    "8000" to "4000" (there's only one place where that happens).

    4. Although I believe in the occasional need for case-sensitive columns, I

    don't believe in server default case-sensitive collations. As a result, be

    aware that I made no attempt to make this code case-INsensitive (although I

    believe that the cases match in this code because of the great job someone

    at Microsoft did on sp_xml_preparedocument).

    5. With a bit of forethought, the code could be modified to automatically

    generate individual appropriately named and keyed tables based on the name

    of each "Entity" contained in the XML. I didn't take the time to do that

    for two reasons... 1) Most people are just happy to flatten the XML and

    get the flat result set just by issuing one command and 2) this exercise

    has already turned my brain into mud because I actually had to work with

    XML and I hate XML. I have to wash my mouth out every time I say it. ;-)

    Rev 00 - 14 Jul 2012 - Jeff Moden - Initial creation and unit test.

    ******************************************************************************/

    --=============================================================================

    -- Presets

    --=============================================================================

    --===== Suppress the auto-display of rowcounts for appearances in this case.

    SET NOCOUNT ON;

    --===== Conditionally drop the working tables to make reruns in SSMS easier.

    IF OBJECT_ID('tempdb..#Edge' ,'U') IS NOT NULL DROP TABLE #Edge;

    IF OBJECT_ID('tempdb..#NodeInfo','U') IS NOT NULL DROP TABLE #NodeInfo;

    --=============================================================================

    -- Create a hierarchical table from the XML document which contains ALL of

    -- the meta-data and data about the XML that we need.

    --=============================================================================

    --===== Create an obviously named variable to refer to the "prepared" XML

    -- document.

    DECLARE @DocHandle INT;

    --===== Microsoft says that the following code "Creates an internal

    -- representation of the document". What they don't tell you is that it's

    -- not much more than a simple "Adjacency List" hierarchy table with a

    -- couple of special columns added. Microsoft refers to this hierarchical

    -- table as an "Edge" table. An "edge" is the line draw between two nodes

    -- on something like an Org Chart.

    EXEC sp_xml_preparedocument @DocHandle OUTPUT, @pXML;

    --===== Copy the data from the implicit "edge" table of the XML into a Temp

    -- Table where we can work on it without having to call OPENXML more

    -- than once.

    --

    -- Note that the max width of "column names" (LocalName, etc) is expected

    -- to be no more that 256 characters. You could set these to be much

    -- larger but I'd also expect that means that someone got silly with the

    -- XML itself.

    SELECT ID = ISNULL(ID,0), --ISNULL makes a NOT NULL column so we can PK it.

    ParentID,

    NodeType, --1="Entity/Table", 2="Attribute/Column", 3="Data Node"

    --4= a new node type to handle "Element Based" data.

    ParentLocalName = CAST(NULL AS VARCHAR(256)),

    LocalName = CAST(LocalName AS VARCHAR(256)),

    Text,

    HLevel = CAST(NULL AS INT),

    Width = CAST(NULL AS VARCHAR(10))

    INTO #Edge

    FROM OPENXML (@DocHandle, '/') --This means "Prepare EVERYTHING" in the XML

    ;

    --===== Remove the XML document to save on resources.

    EXEC sp_xml_removedocument @DocHandle;

    --===== This prevents multiple rescans similar to what you would find in a

    -- "Triangular Join" which would make the UPDATE that follows this very

    -- slow. Since it's a Temp Table, we'll let the machine create a unique

    -- name for the constraint because constraints must be uniquely named and

    -- we'll want to use this code concurrently in the future.

    ALTER TABLE #Edge

    ADD PRIMARY KEY CLUSTERED (ID)

    ;

    --=============================================================================

    -- Add extra information to the "edge" table to more easily create other

    -- simpler information later on.

    --=============================================================================

    --===== Build the hierarchical level column, get the name of the "parent" for

    -- each row, and calculate the max width of each "column" of data. Most of

    -- this is necessary to control the order in which we build the T-SQL to

    -- successfully and fully shred the XML.

    WITH

    cteBuildPath AS

    (--==== This is the "anchor" part of the recursive CTE

    SELECT anchor.ID,

    HLevel = 1,

    LocalName,

    ParentLocalName

    FROM #Edge AS anchor

    WHERE ParentID IS NULL

    UNION ALL -------------------------------------------------------------------

    --==== This is the "recursive" part of the CTE that adds 1 for each level

    -- and carries the previous LocalName forward as the ParentLocalName for

    -- the next level of rows.

    SELECT recur.ID,

    HLevel = cte.HLevel+1,

    LocalName = recur.LocalName,

    ParentLocalName = cte.LocalName

    FROM #Edge AS recur

    INNER JOIN cteBuildPath AS cte

    ON cte.ID = recur.ParentID

    )--==== This does the actual update of the "edge" table from above.

    -- It also calculates the column width of the data so we can modify

    -- the resulting T-SQL later to make it so not every column has to

    -- be a TEXT or NVARCHAR(MAX) column.

    UPDATE tgt

    SET tgt.HLevel = bp.HLevel,

    tgt.ParentLocalName = bp.ParentLocalName,

    tgt.Width = DATALENGTH(tgt.Text)

    FROM #Edge tgt

    INNER JOIN cteBuildPath bp

    ON tgt.ID = bp.ID

    ;

    --=============================================================================

    -- Create the "node information table" which will ultimately be used to

    -- build the T-SQL to read and return the XML data as a "flat" result set.

    --=============================================================================

    --===== This isolates "Entity" and "Attribute" node types and provides the

    -- necessary table aliases. It also keeps us from having to do any more

    -- large scans on the temporary edge table, which could actually be

    -- dropped at this time to conserve resources.

    --

    -- Note that the ISNULL gets rid of the Null aggregate WARNING message

    -- and speeds things up a bit.

    SELECT RowNum = IDENTITY(INT,0,1), --Just a "uniquifier" for a PK.

    NodeType, HLevel, ParentLocalName, LocalName,

    DeDupeLocalName = CAST(NULL AS VARCHAR(256)),

    Alias = CASE --These are "derived table" aliases we need in the T-SQL.

    WHEN NodeType = 1

    THEN 'x' + CAST(ROW_NUMBER() OVER

    (ORDER BY NodeType,HLevel) AS VARCHAR(10))

    ELSE ''

    END,

    Width = MAX(ISNULL(Width,0))

    INTO #NodeInfo

    FROM #Edge

    GROUP BY NodeType, HLevel, ParentLocalName, LocalName

    ;

    --===== Add some indexes just in case there are a lot of "columns" to process.

    -- A lower fill factor is used because of the upcoming data mods in the

    -- rest of the code to reduce costly page and extent splits when there

    -- are a lot of "columns". If you regularly deal with such large XML,

    -- you might want to reduce the FILLFACTOR even more. Still, even if you

    -- don't change it at all, this is going to be a whole lot faster than

    -- trying to write code to flatten the XML on your own.

    ALTER TABLE #NodeInfo

    ADD PRIMARY KEY CLUSTERED (RowNum) --Let the system name it on Temp Tables.

    WITH FILLFACTOR = 80

    ;

    CREATE UNIQUE INDEX IX_#NodeInfo_NodeType_RowNum

    ON #NodeInfo (NodeType, RowNum)

    WITH FILLFACTOR = 80

    ;

    --=============================================================================

    -- Build the FROM clause of the T-SQL. We need to do this first because

    -- it works the same way for both "Attribute Based" and "Element Based"

    -- XML. We'll actually need to modify the node data later on to build the

    -- SELECT clause because the two different types of XML have different

    -- requirements when it comes to node types. "Attribute Based" XML has

    -- "Type 2 (column) nodes behind the scenes where "Element Based" XML

    -- does not. "Hybrid Based" will originally look like "Attribute Based".

    --=============================================================================

    --===== This creates the FROM and OUTER APPLYs using the essential aliases for

    -- each "entity" (Type 1 node) in the XML.

    -- This works as is for both attribute based and non-attibute based XML.

    DECLARE @From VARCHAR(MAX);

    SELECT @From = '';

    SELECT @From = @From

    + ISNULL(' OUTER APPLY ' + ParentLocalName, ' FROM @pXML')

    + '.nodes (' + QUOTENAME(LocalName,'''') + ') '

    + Alias + ' '

    + '(' + LocalName + ')' + CHAR(10)

    FROM #NodeInfo

    WHERE NodeType = 1

    ;

    --=============================================================================

    -- If the XML is "Element Based" rather than "Attribute Based", there are

    -- no "Type 2 (column) nodes. In order to make the SELECT clause for both

    -- in a common fashion, we need to modify the node information table so

    -- that it looks like the data is "Attribute Based".

    --=============================================================================

    --===== We need to add aliases to any Type 2 Nodes that don't already have them

    -- so we can correctly build the SELECT list.

    UPDATE tgt

    SET tgt.Alias = src.Alias

    FROM #NodeInfo tgt

    INNER JOIN #NodeInfo src

    ON tgt.ParentLocalName = src.LocalName

    WHERE src.NodeType = 1 --Entity Node

    AND tgt.NodeType = 2 --Column Node

    ;

    --===== If the XML isn't attribute based, there will be no Type 2 nodes.

    -- Make the node info table look like there are except make them a new

    -- (for this code) "Type 4" (Element Based Column). This is done really

    -- easilysimply by seeing if the "LocalName" is present as a Type 3 data

    -- node according to the "ParentLocalName". This is probably the only

    -- place where duplicate Entity or Attribute names might cause a problem.

    UPDATE #NodeInfo

    SET NodeType = 4

    WHERE NodeType = 1

    AND LocalName IN (SELECT ParentLocalName FROM #NodeInfo WHERE NodeType = 3)

    ;

    --===== Now we need to resolve any duplicate column names so that we're not

    -- confused by the output if dupes exist.

    UPDATE #NodeInfo

    SET DeDupeLocalName = ParentLocalName + '_' + LocalName

    WHERE LocalName IN

    (

    SELECT LocalName

    FROM #NodeInfo

    WHERE LocalName <> '#text' --Only data nodes have this LocalName

    GROUP BY LocalName

    HAVING COUNT(*) > 1

    )

    ;

    --=============================================================================

    -- Now that the node information looks like it's "Attribute Based", copy

    -- the "column width" information from the Type 3 (data) nodes to the

    -- Type 2 (column) and Type 4 (element column) nodes to make the final

    -- dynamic SQL for the SELECT clause a whole lot easier to build and

    -- understand.

    --

    -- Notice that we can also easily handle the MAX datatype here because

    -- we stored the WIDTH as a VARCHAR earlier.

    --=============================================================================

    --===== Copy the column Width from the data nodes to the column nodes

    UPDATE tgt

    SET tgt.Width = CASE

    WHEN src.Width <= 8000

    THEN src.Width

    ELSE 'MAX'

    END

    FROM #NodeInfo tgt

    INNER JOIN #NodeInfo src

    ON tgt.LocalName = src.ParentLocalName

    WHERE tgt.NodeType IN (2,4)

    AND src.NodeType = 3

    ;

    --=============================================================================

    -- That's it. Whether the XML is "Attribute Based", "Element Based" or

    -- a "Hybrid" of the two, we now have enough information to build the

    -- SELECT list for both types of XML with some minor tweaks by CASE

    -- statements as to which data to use.

    --

    -- Although ORDER isn't really important here, it will give you better

    -- hints as to what's actually in the XML and the order that it's in for

    -- troubleshooting purposes.

    --=============================================================================

    --===== This creates the SELECT list

    DECLARE @Select VARCHAR(MAX);

    SELECT @Select = ISNULL(@Select + ',' + CHAR(10)+ SPACE(8), '')

    + ISNULL(DeDupeLocalName,LocalName) + ' = '

    + CASE

    WHEN NodeType = 2 --Node Type 4 implied for the ELSE by WHERE

    THEN Alias + '.' + ParentLocalName + '.value '

    ELSE Alias + '.' + LocalName + '.value '

    END

    + CASE

    WHEN NodeType = 2 --Node Type 4 implied for the ELSE by WHERE

    THEN '(' + QUOTENAME('@' + LocalName ,'''')

    + ', ''VARCHAR(' + Width + ')'')'

    ELSE '(''(text())[1]'', ''VARCHAR(' + Width + ')'')'

    END

    FROM #NodeInfo

    WHERE NodeType IN (2,4)

    ORDER BY HLevel, Alias --Simplifies troubleshooting if needed

    ;

    --===== Put the final SQL all together along with some helpful hints for use.

    DECLARE @SQL NVARCHAR(MAX);

    SELECT @SQL = '

    /*=============================================================================

    The following T-SQL is what shredded the XML you gave it.

    If desired, please modify the code to change the datatypes to suit your needs

    and maybe add an INTO #SomeTableName just before the FROM clause to

    "auto-magically build and store the results into a temporary or other table as

    a "staging" table on-the-fly to work from.

    --Jeff Moden

    =============================================================================*/

    '

    + ' SELECT RowNum = ROW_NUMBER() ' --Ensure unique rows

    + 'OVER(ORDER BY (SELECT NULL)),' + CHAR(10) + SPACE(8)

    + @Select + CHAR(10)

    + @From

    + ';' + '

    --=============================================================================

    -- Hint on how to convert the T-SQL to work on a table column.

    --=============================================================================

    /*

    Change the FROM clause in the SQL above in a similar manner to below to shred

    an entire column of similar (ie: same nodes/structure) XML.

    Change from: (Note that "... etc x" means "the rest of the line of existing code.")

    FROM @pXML.nodes (... etc 1

    OUTER APPLY ... etc 2

    OUTER APPLY ... etc 3

    ... etc x

    Change to:

    FROM dbo.YourTableName x0

    CROSS APPLY YourXMLColumnName.nodes( ... etc 1

    OUTER APPLY ... etc 2

    OUTER APPLY ... etc 3

    ... etc x

    */

    '

    ;

    --===== Display the SQL in the Messages tab so we can reuse it or modify it

    -- to become production code later on.

    PRINT @SQL;

    SELECT [NOTICE!!!] = 'Please see "Messages" tab for the rendered T-SQL'

    UNION ALL

    SELECT [NOTICE!!!] = 'that created the "table" below from the XML.'

    ;

    -- SELECT * FROM #Edge

    -- SELECT * FROM #NodeInfo

    --===== Show that the SQL works as advertised and so that we can examine the

    -- results to tweak the generated SQL for datatypes, etc, later.

    EXECUTE sp_executesql @SQL,

    N'@pXML XML',

    @pXML

    ;

    --select * from #Edge

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Well Done Mr. Moden 😎

  • @thanks Jeff, will try this

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply