Representing XML data in SQL Server

  • Eirikur Eiriksson

    SSC Guru

    Points: 182440

    Unfortunately Jeff, I don't have the system I used for this anymore, refurbished it and donated to someone that didn't have a laptop.

    😎

    The gist of it is that I had some relational data (4NF) exported to XML files, which I then checked for the actual data length. Then, in a vanilla database, I imported the files into clustered index tables as XML data type, using small int as the clustered key (minimizing the impact). Then in a separate vanilla database, I imported the data from the SQL XML storage into the same schema as the original data originated from. Both SQL databases had autogrowth set to a minimal size and the storage quantification was based on the file size deltas, before and after the data import. I know it's not 100%, but I think its close enough.

  • aaron.reese

    SSChampion

    Points: 13415

    #tagbloat

    My Favourite was an XML file I worked with in the health service.  <PersonGenderCurrent>1</PersonGenderCurrent>

    FWIW I still like XML.  Yes you need a decent viewer but XML parsers have been built into browsers for many years, heck even IE10 has a decent XML viewer and now that we have VS code, life is good in the XML world.  Personally I find properly structured XML very easy to read and understand.  Related data (1:n) is included in a pluralised tag to indicate it is a collection then then every instance (row in the tuple) is enclosed in a singular tag.  This is the same logical structure you should be using in JSON but I find it infinitely more readable.

     

  • tbailey 19088

    SSCrazy

    Points: 2072

    This is a nicely written article. Perhaps more significant than representing xml in SQL Server is generating xml from SQL Server for export. Bloviated opinions about the drawbacks of xml aside, the ability to smoothly generate xml from a database is an absolute requirement for b2b and business-to-government communications. In most of these cases, the destination entity has extremely explicit requirements, including their xml schema, for producing structure hierarchical data. A post about xml schema collections and typed xml would be a valuable addition.

    Only FOR XML PATH and For XML EXPLICIT queries have the flexibility to meet these requirements. FOR XML PATH queries are much easier to maintain.  Xml generation is the glaring exception to the general wisdom about avoiding scalar UDFs and nesting of scalar UDFs. UDFs returning xml(<type>) from appropriate keys are a great way to generate complex nodes Nesting scalar UDFs returning xml nodes works well to produce complex xml nodes which contain complex child nodes.

    Thanks diogosouzac!

    Tim Bailey

  • jonathan.crawford

    SSCertifiable

    Points: 6577

    Aaron N. Cutshall wrote:

    Jeff's experience not withstanding, I've actually had good experiences with XML. But then, I was the one who established the format and oversaw its implementation and use. I have seen XML horribly used and most default XML exports are terribly bloated because they don't utilize attributes and treat everything as an element. Situations like this are why XML gets such a bad reputation. It's also why JSON gets trumpeted as a "superior" format. However, while a well-crafted XML format can be validated (through an XML schema definition or XSD) for structure and content, JSON cannot. Yet, the very things that give XML its great flexibility are generally why it gets abused, misused, and horribly twisted. I have seen some XML scenarios that just make me cringe (HL7 v3 is a prime example) due to over-complexity and generally poor design.

    So much this. Validate XML? Why on earth would we do that? That would mean we might have to *learn* something. SGML/XML are great ideas, it's a shame that in practice the files end up referencing invalid formats, badly-thought-out formats, storing things as text, using external scripts all over within the page, which then renders the benefit of the tags somewhat lacking.

    -------------------------------------------------------------------------------------------------------------------------------------
    Please follow Best Practices For Posting On Forums to receive quicker and higher quality responses

  • aaron.reese

    SSChampion

    Points: 13415

    I agree.  So many times I have had to either read or create XML files that are effectively flat data files without proper collections or hierarchy tags and I think - Why?  Why have you just made life so hard for me, for you and for your future self by not building it properly in the first place.

  • Jeff Moden

    SSC Guru

    Points: 996831

    Heh... I've got just one more thing to add when it comes to the transfer of large volumes of properly formed tabular data...

    Make Ascii Great Again

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 6 posts - 16 through 21 (of 21 total)

You must be logged in to reply to this topic. Login to reply