Mr. Peterson, you are obviously an intelligent and well-spoken individual who is very familiar with the topic of data management. I agree with many of the points that you made in your article. However, I feel that you elected to reiterate the weakest aspects of XML implementation without reiterating the strengths of XML when used appropriately. As such, your article is unbalanced and misrepresents the value of XML, both in general and as it is used within a database.
First off, thank you for your kind words.
The whole point of the article is to show how the supposed strengths of XML are not significant when compared to its problems. Yes, you can use XML to solve some problems, but there are pre-existing and more efficient means in every case I can think of. In several years of discussing the subject, no one has given me a good example of a task that could not be performed more efficiently by some other means.
Consider the difficulty of replacing an entrenched legacy application within a business. Many applications have "tentacles" that reach into dozens of other places. This makes them hard to remove and replace.
XML, as a data communication mechanism, has allowed me to work with "uncertainty." I can create a set of interfaces between existing systems. I can send more complete data than is currently required, using XML. (XML simply ignores extra data). In this way, I can create one interface that will work for both an OLD system and a NEW one. Note: I may not know what the new system is, or what it requires. Using XML, the BUSINESS can describe the transactions, and I can implement them.
Here you are talking strictly of data transport. I do acknowledge that XML can be of some use here; however, the same thing can be done using ANY agreed-upon data file format. Unfortunately for XML, almost any other physical file format will be more efficient... The business will not be describing the transactions using XML; you do that. No matter the method of transport, there must be agreement on the PRECISE meaning of the data being sent and received. XML tags are not a sufficient description in and of themselves.
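To make the transport point concrete, here is a minimal sketch in Python. It assumes a hypothetical order message with invented field names (id, customer, amount, priority) and is not taken from any real system. It shows the same records carried as XML and as CSV with a header row, read by a consumer that only knows three of the four fields. In both cases it is the reader, not the file format, that skips the extra data.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical order message; the field names are illustrative only.
XML_MESSAGE = """<orders>
  <order><id>1001</id><customer>ACME</customer><amount>250.00</amount><priority>high</priority></order>
  <order><id>1002</id><customer>Globex</customer><amount>75.50</amount><priority>low</priority></order>
</orders>"""

CSV_MESSAGE = """id,customer,amount,priority
1001,ACME,250.00,high
1002,Globex,75.50,low
"""

# The old system only knows these fields; "priority" is extra data.
KNOWN_FIELDS = ("id", "customer", "amount")


def read_xml(text):
    """Pick the known fields out of each <order>, skipping anything else."""
    root = ET.fromstring(text)
    return [
        {name: order.findtext(name) for name in KNOWN_FIELDS}
        for order in root.findall("order")
    ]


def read_csv(text):
    """Pick the known columns by header name, skipping anything else."""
    reader = csv.DictReader(io.StringIO(text))
    return [{name: row[name] for name in KNOWN_FIELDS} for row in reader]


xml_rows = read_xml(XML_MESSAGE)
csv_rows = read_csv(CSV_MESSAGE)

# Both readers recover the same records and ignore the extra field.
print(xml_rows == csv_rows)

# Size in bytes of each encoding of the same two records.
xml_size = len(XML_MESSAGE.encode("utf-8"))
csv_size = len(CSV_MESSAGE.encode("utf-8"))
print(xml_size, csv_size)
```

Neither reader knows what "amount" means, which currency it is in, or whether it includes tax. That agreement still has to be made between the parties, whatever the file format.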
Once I've reworked the integration points, I can replace a legacy system with a new one, and NONE of the integrating partners care. THIS IS A MAJOR BONUS. Your article missed it completely.
This is nothing more than providing a layer of abstraction between systems, and again it can be done in a number of more efficient ways than to use XML.
As for storing XML in a database... have you ever seen a database that stored long text strings? Say quotes from authors, or forum messages? The database is not required to maintain the meaning of these text strings. The database maintains the relationships between these strings and the data that surrounds them (who wrote it, dates, sources, etc).
XML, stored in a database, can be used in the same way, if the same conditions apply. XML provides a good way to store data as a "document" that is self consistent. This document can have a meaning to some other system component. However, there is no REQUIREMENT that this meaning must be enforced in the database that holds it.
At the risk of being redundant, just because XML is structured and has meaning, that doesn't mean that it MUST have meaning to the database.
In this case, it makes good sense to use the database to store XML strings. If there are values that need to be mined out of the XML string to make it useful, they should be stored and managed in relational columns, as you'd expect (and, I suspect, defend). Yes, you'd be simply "storing" the XML in the db, and not managing it. So what? Do you "manage" the text in author's quotes?
In conclusion: there are specific times when it is OK to store XML in a database. Also, XML is immediately and intensely useful in application integration, far more so than the CSV example you relied upon.
Of course I have seen large character strings stored in a database. I am usually very suspicious of them and do not allow them without GOOD reason. There are good reasons to allow them in some systems. And yes, there are times when a given kind of "document" is actually an attribute of some entity. In those cases the DBMS must assume that the document is internally consistent. However, XML is by definition not just another text document. It has entities and attributes which are likely important to the business (or else why bother?). If there are distinct entities and attributes (to speak very loosely) to be stored in the database, then at a minimum I want datatype constraints enforced. This is impossible with raw XML.
Perhaps the worst aspect of XML is its hierarchical nature. Hierarchical data structures are inflexible and inefficient for general data management and storage purposes. The other objection I have to storing XML in the database is that it takes up too much space.
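The datatype point can be illustrated with a small sketch in Python using the standard sqlite3 module. The schema, table names, and the sample document are hypothetical, and SQLite is only a convenient stand-in for whatever DBMS is actually in use. One table holds the XML as an opaque string; the other holds the same fact in a typed column with a CHECK constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one table keeps the raw XML as an opaque string,
# the other keeps the same fact in a typed, constrained column.
conn.execute("CREATE TABLE order_xml (id INTEGER PRIMARY KEY, doc TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE order_rel (
           id INTEGER PRIMARY KEY,
           amount REAL NOT NULL
               CHECK (typeof(amount) IN ('integer', 'real') AND amount >= 0)
       )"""
)

bad_doc = "<order><amount>twelve-ish</amount></order>"

# The DBMS sees only a string, so the nonsensical amount is accepted.
conn.execute("INSERT INTO order_xml (doc) VALUES (?)", (bad_doc,))
xml_accepted = conn.execute("SELECT COUNT(*) FROM order_xml").fetchone()[0] == 1

# The same value aimed at the constrained column is rejected by the DBMS.
try:
    conn.execute("INSERT INTO order_rel (amount) VALUES (?)", ("twelve-ish",))
    rel_rejected = False
except sqlite3.IntegrityError:
    rel_rejected = True

print(xml_accepted, rel_rejected)
```

In the first table any validation of the amount would have to happen in application code before the insert. In the second, the rule lives with the data and applies to every application that writes to the table.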
As for the tone of your article, I must ask you to consider something. I trust your apparent knowledge on data management. Please trust me on application integration. With all due respect, if you blast those who have not learned data management principles, realize that you are perilously close to falling into the camp of those who have a poor understanding of enterprise application integration and system communication.
Please, take a deep breath. If top tier data analysts like yourself and solution designers (like myself and others) work together, and use the appropriate tools for the appropriate tasks, we can create good designs that solve current and future needs efficiently.
With utmost respect,
I am familiar with enterprise application integration, having come from that background. I know that integration efforts are non-trivial. However, I do not believe that there is room to take shortcuts when it comes to data integrity.
I fully agree that data management professionals and application developers must cooperate to solve problems, but when it comes to storing XML in the databases I manage, the answer is NO! If you need a place to store XML, use a file. If, however, you want a place to properly manage data, use a properly designed database.
If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek