XML capacity for large data volumes?

  • In another thread:

    http://www.sqlservercentral.com/Forums/Topic470738-146-1.aspx

    A comment, in reference to a slow export to Excel, was made that said:

    "If you have the option of using XML, you may find less issues later."

    My only attempt to use XML was generating an XML file from a ~62k row Excel file, and the output seemed unusable; as I recall even XML Notepad 2007 choked trying to open it.

    That, combined with the huge overhead of the additional characters required to define the data in an XML format, left me with the feeling that XML just wasn't going to hack it for serious (large-volume) data transfer.

    But the rest of the world has jumped on the XML bandwagon, so perhaps I ought to reconsider.

    What experiences have others had with XML?

  • Funny - I made the post you are referring to.

    I was mostly making the comment in reference to Excel. Before Excel 2007, a worksheet was limited to 65,536 rows, so you could not generate a spreadsheet with more data than that. Excel 2007 actually stores its data in zipped XML files (rename an .xlsx file to .zip and open it to see what I mean). This eliminated many restrictions, but SSIS will still consume an XML document faster than an Excel spreadsheet with the same information, and the Excel document will usually be larger because of the additional information needed to describe workbook properties.

    As far as data transfer goes, prior to XML the generally accepted options were delimited and fixed-width files. Both are generally lighter than XML (tab separators are much smaller than XML tags), but they were very limited as a generic way of describing the data. You really had to have a custom application specifically designed to consume the information.

    The big advantage of XML is that the description of the data can be included with the data, allowing very generic applications to consume it rather easily. In addition, it is rich enough to describe relationships within the information (child nodes), which is something that generic delimited files really do not do well.
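
    To put rough numbers on the size trade-off described above, here is a minimal Python sketch (not from the original posts). The sample columns (id, name, qty) and the helper names are assumptions made up for illustration. It writes the same rows as CSV and as XML so the byte counts can be compared, and then reads the XML back as a stream rather than loading the whole document, which is one common way to cope with large files.

```python
import csv
import io
import xml.etree.ElementTree as ET


def make_rows(n):
    """Generate n hypothetical sample rows (columns are illustrative only)."""
    return [{"id": i, "name": f"item{i}", "qty": i % 100} for i in range(n)]


def to_csv(rows):
    """Serialize rows as comma-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["id", "name", "qty"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")


def to_xml(rows):
    """Serialize rows as XML, one <row> element with a child element per column."""
    root = ET.Element("rows")
    for r in rows:
        row = ET.SubElement(root, "row")
        for key, value in r.items():
            ET.SubElement(row, key).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def count_rows_streaming(xml_bytes):
    """Count <row> elements with iterparse instead of building the full tree first."""
    count = 0
    for _event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == "row":
            count += 1
            elem.clear()  # discard the row's children once it has been handled
    return count


rows = make_rows(62_000)  # roughly the row count mentioned in the first post
csv_bytes = to_csv(rows)
xml_bytes = to_xml(rows)
print("CSV bytes:", len(csv_bytes))
print("XML bytes:", len(xml_bytes))
print("Rows read back from XML:", count_rows_streaming(xml_bytes))
```

    The XML output is larger because every value is wrapped in an opening and closing tag, but the tags are also what let a generic reader find the columns without being told the layout in advance.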

  • Yup, that was your comment -- I was hoping that you might see the new thread and reply. (I didn't want to take the original thread so far off topic.)

    So, I'll assume from your reply that SSIS can process XML faster than XML Edit. I'll run some tests to learn more.

    Thanks!
