• There is no question that XML causes the data files to bloat.  On the other hand, it is much less ambiguous than swapping CSV files with knuckleheads who don't know how to deal with text qualifiers, embedded delimiters, columns with multi-line text values, and so on.  Have you ever dealt with someone who thought it was your fault that your code couldn't handle \" instead of "" in a quote-qualified field?
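
    For anyone who has not been burned by this: the conventional (RFC 4180) way to put a quote inside a quoted CSV field is to double it, not to backslash-escape it.  A minimal C# sketch of the doubling rule (the helper name is mine):

        using System;

        class CsvQuoting
        {
            // Quote a field the RFC 4180 way: wrap it in quotes and double any
            // embedded quote characters.  Commas, quotes, and newlines all survive.
            static string Quote(string field)
            {
                if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0)
                    return field;                       // no qualifier needed
                return "\"" + field.Replace("\"", "\"\"") + "\"";
            }

            static void Main()
            {
                string value = "He said \"hi\",\nthen left";
                Console.WriteLine(Quote(value));
                // Prints:  "He said ""hi"",
                //          then left"
                // A \" in that spot is the variant that breaks standard parsers.
            }
        }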

    One application I dealt with that benefited from XML data files was downloading reports (as XML) from a web service, where the report columns were subject to change without notice.  An XML query could return all the field names from the first record, and those were used to build a dynamic SQL query to read the columns in that report.  The data columns came from a limited domain of possibilities, so a data dictionary table supplied the data type for each column.  The biggest argument against CSV in this case was text columns with multi-line values that were guaranteed to include commas, quotes, tabs, line feeds, and other obnoxious characters.
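
    The original pulled the field names with an XML query inside SQL Server; the same idea as a C# sketch looks roughly like this, where the report layout, file name, and type lookup are all stand-ins for illustration:

        using System;
        using System.Linq;
        using System.Xml.Linq;

        class ReportLoader
        {
            static void Main()
            {
                // Hypothetical layout: <Report><Row><SomeColumn>...</SomeColumn>...</Row>...</Report>
                var doc = XDocument.Load("report.xml");

                // Field names come from the first record only; later records are
                // assumed to carry the same columns.
                var columns = doc.Root.Elements("Row").First()
                                 .Elements()
                                 .Select(e => e.Name.LocalName)
                                 .ToList();

                // Data types come from a data dictionary table keyed on column name;
                // LookupType stands in for that query here.
                var columnDefs = string.Join(", ",
                    columns.Select(c => "[" + c + "] " + LookupType(c)));

                // Dynamic SQL built from whatever columns this report happens to have.
                Console.WriteLine("CREATE TABLE #Report (" + columnDefs + ");");
            }

            // Placeholder for the data dictionary lookup.
            static string LookupType(string column) =>
                column.EndsWith("Date") ? "datetime" : "nvarchar(max)";
        }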

    In another case I had to replicate tables from an old Progress system to an SQL Server 2005 instance.  The ODBC connection between the servers had issues: querying a large table created so many locks in the Progress database that it would blow up.  But Progress code dumping the table to a text file ran efficiently and without killing the server.  (The reason for not using CSV escapes me at the moment.)  The largest table was too big for either server to digest at once as an XML DOM object, but I could write a Progress program to export the data to a file as a text stream in XML format, and in SSIS I wrote a script task that read the file token-by-token with the XmlStreamReader class.  I'm not sure I want to write generic input code to read any possible XML file, but I controlled the source file format and thus could keep the parsing simple.  There's no rule that loading the entire XML file into memory at once is the only way to deal with it.
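
    A stripped-down C# sketch of that streaming approach: it assumes a flat <table><row><col>...</col></row> layout (the names are mine) and uses .NET's pull-style streaming reader, XmlReader, rather than loading a DOM:

        using System;
        using System.Xml;

        class StreamingImport
        {
            static void Main()
            {
                // Read the export node-by-node; only the current value is ever in
                // memory, so the file size never matters the way it does for a DOM load.
                var settings = new XmlReaderSettings { IgnoreWhitespace = true };
                using (var reader = XmlReader.Create("progress_dump.xml", settings))
                {
                    string currentColumn = null;
                    while (reader.Read())
                    {
                        switch (reader.NodeType)
                        {
                            case XmlNodeType.Element:
                                if (reader.Name != "table" && reader.Name != "row")
                                    currentColumn = reader.Name;   // a column element
                                break;
                            case XmlNodeType.Text:
                                // The real task buffered these values and wrote them to
                                // the destination table; here they are just echoed.
                                Console.WriteLine(currentColumn + " = " + reader.Value);
                                break;
                            case XmlNodeType.EndElement:
                                if (reader.Name == "row")
                                    Console.WriteLine("-- end of row --");
                                break;
                        }
                    }
                }
            }
        }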