• "Actually, that statement is true when the data model agreed upon is XML, because once that's done, the data processing is very simple."

    Data models are independent of their representation. When you define an XML vocabulary (remember, XML by itself is nothing), you get access to parsing tools and format-conversion tools. That's not nothing, but it's nothing to write home about, IMHO.
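
    To put it concretely, here is roughly all the "free" part buys you (the element names below are made up):

        import xml.etree.ElementTree as ET

        # Any well-formed XML parses for free with a stock parser...
        doc = ET.fromstring("<provider><name>Dr. Smith</name><fee>120.00</fee></provider>")

        # ...but the parser only hands you a tree. Knowing that <fee> is a
        # dollar amount that belongs to a provider is still your problem.
        for child in doc:
            print(child.tag, child.text)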

    "In some cases though, if the data model is not XML, the complexitiy of processing the data may take a big chunk out of the developing time. For an extreme example, think about a Java shop receiving Excel files from a Microsoft shop. Regardless of this, however, my main point is that by adopting XML as a standard, everybody concentrates on the data itself and the processing is left to the already highly optimized (and free) tools that are available in all platforms. It's a win win situation."

    Let me give you an XML<=>XML nightmare. Let's say you need to convert TBook to DocBook. You can write an XSLT stylesheet to do that, but it would be an onerous task. The fact that both are XML isn't much help.

    Unless, of course, someone has already built it for you.
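
    If no one has, the job looks roughly like this. The tag names below are invented and grossly simplified, but every rule in a real TBook-to-DocBook stylesheet has to be written by hand in just the same way:

        import xml.etree.ElementTree as ET

        # Hypothetical one-to-one tag mapping; a real conversion needs hundreds
        # of rules plus structural rearrangement that no generic XML tool
        # does for you.
        TAG_MAP = {"tbook": "book", "chapt": "chapter", "p": "para"}

        def convert(elem):
            # Rename the tag if we have a rule for it, otherwise pass it through.
            out = ET.Element(TAG_MAP.get(elem.tag, elem.tag), elem.attrib)
            out.text, out.tail = elem.text, elem.tail
            for child in elem:
                out.append(convert(child))
            return out

        src = ET.fromstring("<tbook><chapt><p>Hello</p></chapt></tbook>")
        print(ET.tostring(convert(src), encoding="unicode"))
        # <book><chapter><para>Hello</para></chapter></book>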

    "I use XML to process information we receive from several vendors on health care providers and fees. The data models are very simple, nothing fancy here, but we were having a hell of a hard time trying to adapt to all the different formats the vendors were using (csv, excel, access, fixed width text files, etc). We convinced most of the vendors (not all, it never works that way) to switch to XML and it's been great for the most part: Our code can detect if a new field was added or deleted from the schema. If deleted, it will be ignored, if added, a new column willl be added to the appropriate table with the appropriate data type, and then the file will be imported. Of course, we know that the changes to the schema are never too large, it's usually a couple of fields added or deleted, I'm not claiming this is something that can be used to handle ANY schema regardless (it would result in a rather ugly looking database table). Most of the effort goes into changing existing stored procs and reports to add the new column where appropriate (or delete it if necessary). But the main point is that we seldom have to deal with the data directly. Before XML, changes to a file's layout usually required recoding of our applications AND our stored procs/reports."

    From what I gather from your response, you do have to do a lot of recoding when the schema changes. You've standardized on a data transfer format, which is good, but you still have lots of work to do. Using XML saved you some, I guess.
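
    To be fair, the schema-comparison step you describe really is simple. A sketch of the "detect added/deleted fields" part might look like this (the XSD shape, table name, and type handling are all invented for illustration); the stored procs and reports are where the real work stays:

        import xml.etree.ElementTree as ET

        XS = "{http://www.w3.org/2001/XMLSchema}"

        def fields(xsd_text):
            # Collect element names and declared types from a flat XSD.
            root = ET.fromstring(xsd_text)
            return {e.get("name"): e.get("type", "xs:string")
                    for e in root.iter(XS + "element") if e.get("name")}

        def schema_diff(old_xsd, new_xsd, table="provider_fees"):
            old, new = fields(old_xsd), fields(new_xsd)
            removed = sorted(set(old) - set(new))   # these get ignored on import
            added = {n: t for n, t in new.items() if n not in old}
            ddl = ["ALTER TABLE %s ADD %s VARCHAR(255)  -- declared as %s"
                   % (table, name, typ) for name, typ in added.items()]
            return removed, ddl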

    "I don't know what you mean exactly by a dynamic data model, but we are not doing anything fancy. The files we get come with their own schemas, but we first validate agains the old schema. If the validation succeeds, we move to the next stage to process the data, if not we determine the differences by comparing the new and old schemas and proceed as explained above. If this sound too simple, it's because it really is, and that's exactly my point, by using XML this type of tasks have become rather mundane."

    I did something similar with SAS Version 5 XPT format datasets. The absolute worst data format in the world. However, there were many parsing and conversion tools already written, which also made the conversion/parsing tasks trivial.
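
    For comparison, pulling an XPT file into something usable with an off-the-shelf library is about this much work today (the filenames are made up):

        import pandas as pd

        # pandas ships a reader for SAS XPORT (version 5) transport files, so
        # even the "worst format in the world" round-trips into a DataFrame
        # and out to anything else in a couple of lines.
        df = pd.read_sas("claims.xpt", format="xport")
        df.to_csv("claims.csv", index=False)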

    In other words, you don't need XML to accomplish that. And XML costs a lot.