SQLServerCentral is supported by Red Gate Software Ltd.
 
What is XML?
Posted Monday, July 24, 2006 12:20 PM
SSC Rookie

Group: General Forum Members
Last Login: Thursday, January 12, 2012 1:19 PM
Points: 35, Visits: 4

>>I have not seen any developer using DTD to handle dynamically different document formats.

Just because you haven't seen it doesn't mean it's not good or it's not being used by others. In fact, I have used it in a way that makes maintenance of my applications a lot easier than before. I process data from multiple vendors and they of course change their formats frequently. Having code that will dynamically adjust to a change in the XML schema has been great for me. But of course, beauty is in the eye of the beholder, so I'm not claiming everyone should be doing that. In the end XML is one more tool that happens to work for many of us.

Post #296857
Posted Monday, July 24, 2006 12:45 PM
SSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 2:47 PM
Points: 132, Visits: 114

"Compare that with the XML solution: once you agree on a particular schema (including a DTD), all the grunt work is already coded in most XML frameworks, you only worry about processing the data."

If you agree on a data model, that's 95% of the effort. Sam, may I ask, what are you using XML languages for? How complicated are your data models? There's no question that with enough effort, you can get XML to work; the question is, can you get it to work well?

BTW, how in the world do you do dynamic data models in XML? Inquiring minds wanna know!

Post #296863
Posted Monday, July 24, 2006 12:49 PM
Forum Newbie

Group: General Forum Members
Last Login: Wednesday, February 11, 2009 3:55 PM
Points: 6, Visits: 3
I am underwhelmed by this article, and not just because it's dated. (DTDs?) Most of the problems the author attributes to XML are in fact problems associated with the application domain that he's trying to apply XML to. It's hard to get the pharmaceutical industry to define and adopt a common data model? That has *nothing to do* with XML.

Similarly, the complaint that XML isn't inherently readable, and that you have to write XSLT to convert it to HTML for presentation, is muddle-headed. What format for representing structured data in a transportable form *would* be human-readable? CSV? JSON?

The complaint that XML is hierarchical is also bogus. It's perfectly straightforward to represent data relationally in an XML document, if you have to. It's just that for most uses of XML, where an XML document is a concise and transportable bag of information that can be parsed into an object model, representing relationships hierarchically is easier. Since object models represent hierarchies, and they represent them because the hierarchies actually exist in the real world that the objects are modeling, a far more apposite complaint would be that relational databases do a poor job of modeling hierarchies. If you're going to get into this at all, which you shouldn't.
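To make the "relational in XML" point concrete, here is a minimal sketch, assuming Python's standard xml.etree.ElementTree module. The document, the element names, and the join_fees helper are all made up for illustration; the idea is only that two flat "tables" can sit side by side in one document and be linked by a key rather than by nesting.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: two flat "tables" linked by a key, with no nesting between them.
DOC = """<data>
  <providers>
    <provider id="1" name="Acme Clinic"/>
    <provider id="2" name="Bay Medical"/>
  </providers>
  <fees>
    <fee provider_id="1" code="A100" amount="25.00"/>
    <fee provider_id="2" code="A100" amount="30.00"/>
    <fee provider_id="1" code="B200" amount="40.50"/>
  </fees>
</data>"""


def join_fees(xml_text):
    """Join fee rows to provider rows on the key, much like a SQL inner join."""
    root = ET.fromstring(xml_text)
    # Build a lookup from provider id to provider name.
    names = {p.get("id"): p.get("name") for p in root.iter("provider")}
    # Resolve each fee's foreign key against the lookup, in document order.
    return [
        (names[f.get("provider_id")], f.get("code"), f.get("amount"))
        for f in root.iter("fee")
    ]


rows = join_fees(DOC)
```

Whether this is a good idea for a given application is a separate question; the sketch only shows that the format does not force a hierarchy on you.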

This article is primarily complaining about the hardness of certain problems and then castigating XML for not magically solving them.

A *good* article about XML would have addressed the simple question that kagemaru asked:

> In most of cases, they will state together what will be the
> XML structure to have their program running well : So, what
> is the difference with a csv file ?

Well, what *are* the differences between representing data in XML instead of CSV? Here's a couple off the top of my head:

- XML provides a good way of representing hierarchies. CSV doesn't.
- XML documents can be reliably parsed into an object model that can be programmatically manipulated; this object model is implemented in a standard way on many different platforms. Not so for CSV.
- A powerful (if daunting to learn) platform-independent language for searching through XML-based object models exists. Not so for CSV.
- Standard, platform-independent tools for defining the format and organization of XML documents and validating that a given document contains what it's supposed to contain exist. Not so for CSV.
- Standard, platform-independent (are you detecting a theme?) tools for transforming data in XML documents into other formats (in particular, into HTML, for presentation) exist. Not so for CSV.
- There's a consistent way of programmatically verifying that the data in an XML document hasn't been corrupted or truncated in transit. Not so for CSV.
- XML documents tell you their character encoding. CSV documents don't.
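A few of the points in the list above can be sketched in a handful of lines, assuming Python's standard xml.etree.ElementTree module. The sample document and the helper names (fee_codes, is_well_formed) are hypothetical. Note that a well-formedness check like this catches truncation and broken markup, not every possible kind of corruption.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one provider with a nested list of fees.
# The document is bytes, so the parser reads the declared encoding itself.
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<providers>
  <provider id="p1">
    <name>Acme Clinic</name>
    <fees>
      <fee code="A100">25.00</fee>
      <fee code="B200">40.50</fee>
    </fees>
  </provider>
</providers>"""


def fee_codes(xml_bytes):
    """Parse the document into a tree and return every fee code in document order."""
    root = ET.fromstring(xml_bytes)
    # ElementTree supports a limited XPath subset; ".//fee" matches <fee> at any depth.
    return [fee.get("code") for fee in root.findall(".//fee")]


def is_well_formed(xml_bytes):
    """Return True if the bytes parse as XML, False if truncated or malformed."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False


codes = fee_codes(DOC)
# Cutting the document in half leaves elements unclosed, which the parser rejects.
truncated_ok = is_well_formed(DOC[: len(DOC) // 2])
```

A CSV file cut off at a row boundary, by contrast, still looks like a perfectly valid (shorter) CSV file.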

Now XML has plenty of drawbacks. Above all, it's not terse. This drives people crazy: all those redundant closing tags and repeated attribute tags. That's part of the cost that you pay for representing your data in a transportable form that can be programmatically validated and transformed by platform-independent tools. If you don't need those things, you shouldn't be using XML.

Also, namespaces and CDATA are hard to understand. I'd contend that if you're using XML and you don't understand namespaces, you probably don't really understand what XML is yet and you probably shouldn't be using it.

Finally, since any clown can write an XML document, lots of clowns do. It's a general-purpose format for representing data, and as such, it lets you make all kinds of stupid mistakes if you don't know what you're doing. Pick up just about any "introducing XML" book and you'll find an inadvertent compendium of worst practices.

XML is a technology that addresses a lot of hard problems better than any other technology we have. If you don't need those problems solved (as the author of this article clearly doesn't think he does), you shouldn't be using XML.

Robert Rossney
rbr@well.com
Post #296867
Posted Monday, July 24, 2006 1:51 PM
SSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 2:47 PM
Points: 132, Visits: 114

"This article is primarily complaining about the hardness of certain problems and then castigating XML for not magically solving them."

That was precisely the point of the article. XML is (often) sold as magic, when it is anything but. I go into a few of the problems with XML in the article, but it is mostly a rant against the hype.

I wrote (in 1999):

"Data transfer standards are fine, and XML is as good a choice as any. Just don't think that it won't take its pound of flesh, just like every other technology known. My guess is that it will be one of those technologies of the future that always remain so."

I leave it to you whether I was a prophet.

Post #296882
Posted Monday, July 24, 2006 2:49 PM
SSC Rookie

Group: General Forum Members
Last Login: Thursday, January 12, 2012 1:19 PM
Points: 35, Visits: 4

>> If you agree on a data model, that's 95% of the effort.

Actually, that statement is true when the data model agreed upon is XML, because once that's done, the data processing is very simple. In some cases though, if the data model is not XML, the complexity of processing the data may take a big chunk out of the development time. For an extreme example, think about a Java shop receiving Excel files from a Microsoft shop. Regardless of this, however, my main point is that by adopting XML as a standard, everybody concentrates on the data itself and the processing is left to the already highly optimized (and free) tools that are available on all platforms. It's a win-win situation.

>>Sam, may I ask, what are you using XML languages for? How complicated are your data models?

I use XML to process information we receive from several vendors on health care providers and fees. The data models are very simple, nothing fancy here, but we were having a hell of a hard time trying to adapt to all the different formats the vendors were using (CSV, Excel, Access, fixed-width text files, etc.). We convinced most of the vendors (not all, it never works that way) to switch to XML and it's been great for the most part: our code can detect if a field was added to or deleted from the schema. If a field was deleted, it is ignored; if one was added, a new column will be added to the appropriate table with the appropriate data type, and then the file will be imported. Of course, we know that the changes to the schema are never too large; it's usually a couple of fields added or deleted. I'm not claiming this is something that can be used to handle ANY schema regardless (it would result in a rather ugly-looking database table). Most of the effort goes into changing existing stored procs and reports to add the new column where appropriate (or delete it if necessary). But the main point is that we seldom have to deal with the data directly. Before XML, changes to a file's layout usually required recoding of our applications AND our stored procs/reports.

>>BTW, how in the world do you do dynamic data models in XML? Inquiring minds wanna know!

I don't know what you mean exactly by a dynamic data model, but we are not doing anything fancy. The files we get come with their own schemas, but we first validate against the old schema. If the validation succeeds, we move to the next stage to process the data; if not, we determine the differences by comparing the new and old schemas and proceed as explained above. If this sounds too simple, it's because it really is, and that's exactly my point: by using XML, this type of task has become rather mundane.
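The compare-and-adjust step described above might look roughly like the sketch below. This is not the poster's actual code: the field dictionaries, the XSD_TO_SQL type mapping, the table name, and both helper functions are assumptions made for illustration, and the schemas are represented as plain {field name: type} dictionaries rather than parsed from real schema files.

```python
# Hypothetical mapping from XML Schema simple types to SQL Server column types.
XSD_TO_SQL = {
    "xs:string": "NVARCHAR(255)",
    "xs:int": "INT",
    "xs:decimal": "DECIMAL(18, 4)",
    "xs:date": "DATETIME",
}


def diff_schemas(old_fields, new_fields):
    """Return (added, removed) given two {field name: xsd type} dictionaries."""
    added = {name: typ for name, typ in new_fields.items() if name not in old_fields}
    removed = sorted(name for name in old_fields if name not in new_fields)
    return added, removed


def alter_statements(table, added):
    """Build one ALTER TABLE statement per newly added field.

    Unknown types fall back to NVARCHAR(255). Names are assumed to be trusted
    here; real code should validate identifiers before building SQL from them.
    """
    return [
        f"ALTER TABLE {table} ADD {name} {XSD_TO_SQL.get(typ, 'NVARCHAR(255)')} NULL;"
        for name, typ in sorted(added.items())
    ]


# Hypothetical old and new field lists: "fax" was dropped, "effective" was added.
old = {"provider_id": "xs:int", "fee": "xs:decimal", "fax": "xs:string"}
new = {"provider_id": "xs:int", "fee": "xs:decimal", "effective": "xs:date"}

added, removed = diff_schemas(old, new)
statements = alter_statements("provider_fees", added)
```

Removed fields are only reported, not dropped from the table, which matches the "if deleted, it will be ignored" behaviour described earlier in the thread.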

Not to say that everything XML is simple or that everything non-XML is more complicated than it should be, but there is definitely a reason for all the XML hype. 

Post #296894
Posted Monday, July 24, 2006 3:14 PM
SSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 2:47 PM
Points: 132, Visits: 114

"Actually, that statement is true when the data model agreed upon is XML, because once that's done, the data processing is very simple."

Data models are independent of their representation. When you create an XML language (remember, XML itself is nothing), then you have access to parsing tools and format conversion tools. That's not nothing, but nothing to write home about, IMHO.

"In some cases though, if the data model is not XML, the complexity of processing the data may take a big chunk out of the development time. For an extreme example, think about a Java shop receiving Excel files from a Microsoft shop. Regardless of this, however, my main point is that by adopting XML as a standard, everybody concentrates on the data itself and the processing is left to the already highly optimized (and free) tools that are available on all platforms. It's a win-win situation."

Let me give you an XML<=>XML nightmare. Let's say you need to convert Tbook to Docbook. You can write XSL to do that, but it would be an onerous task. The fact that both are XML isn't much help.

Unless, of course, someone has already built it for you.

"I use XML to process information we receive from several vendors on health care providers and fees. The data models are very simple, nothing fancy here, but we were having a hell of a hard time trying to adapt to all the different formats the vendors were using (CSV, Excel, Access, fixed-width text files, etc.). We convinced most of the vendors (not all, it never works that way) to switch to XML and it's been great for the most part: our code can detect if a field was added to or deleted from the schema. If a field was deleted, it is ignored; if one was added, a new column will be added to the appropriate table with the appropriate data type, and then the file will be imported. Of course, we know that the changes to the schema are never too large; it's usually a couple of fields added or deleted. I'm not claiming this is something that can be used to handle ANY schema regardless (it would result in a rather ugly-looking database table). Most of the effort goes into changing existing stored procs and reports to add the new column where appropriate (or delete it if necessary). But the main point is that we seldom have to deal with the data directly. Before XML, changes to a file's layout usually required recoding of our applications AND our stored procs/reports."

From what I gather from your response, you do have to do a lot of recoding when the schema changes. You've standardized on a data transfer format, which is good, but you still have lots of work to do. Using XML saved you some, I guess.

"I don't know what you mean exactly by a dynamic data model, but we are not doing anything fancy. The files we get come with their own schemas, but we first validate against the old schema. If the validation succeeds, we move to the next stage to process the data; if not, we determine the differences by comparing the new and old schemas and proceed as explained above. If this sounds too simple, it's because it really is, and that's exactly my point: by using XML, this type of task has become rather mundane."

I did something similar with SAS Version 5 XPT format datasets. The absolute worst data format in the world. However, there were many parsing and conversion tools already written, which also made the conversion/parsing tasks trivial.

In other words, you don't need XML to accomplish that. And XML costs a lot.

Post #296899
Posted Monday, July 24, 2006 3:28 PM
SSCrazy

Group: General Forum Members
Last Login: Today @ 9:02 AM
Points: 2,915, Visits: 1,849
Consuming XML isn't hard; digesting it is, and parsing it makes my eyes water.

I've used it for what it is supposed to be used for: separating content from presentation.

When it is used for what it was originally intended, it works.

I'm just waiting for a gently lobbed D.C.Peterson hand grenade.


Post #296900
Posted Monday, July 24, 2006 4:05 PM
SSC Rookie

Group: General Forum Members
Last Login: Thursday, January 12, 2012 1:19 PM
Points: 35, Visits: 4

"Let me give you an XML<=>XML nightmare. Let's say you need to convert Tbook to Docbook. You can write XSL to do that, but it would be an onerous task. The fact that both are XML isn't much help. Unless, of course, someone has already built it for you."

What's your point? Name ANY technology, past present or even future, and I will show you a situation where using that technology would be an absolute nightmare. Of course there are situations where XML is not appropriate. But there are many where it's just what the doctor ordered.

"From what I gather from your response, you do have to do a lot of recoding when the schema changes. You've standardized on a data transfer format, which is good, but you still have lots of work to do. Using XML saved you some, I guess."

And that's exactly my point. The amount of recoding was reduced substantially by moving to XML.

"I did something similar with SAS Version 5 XPT format datasets. The absolute worst data format in the world. However, there were many parsing and conversion tools already written, which also made the conversion/parsing tasks trivial. In other words, you don't need XML to accomplish that."

Nobody is claiming that you need XML for EVERYTHING. But you must admit that for a large number of tasks XML makes life easier for everybody. The same way you used existing tools to make the conversion/parsing trivial, a lot of people are using XML tools in a way that makes their programs a lot easier than before, sometimes trivial. You are ignoring the great advantage that comes from adopting a format that's shared by many. Could it be better? Maybe, but that's not the point. The fact is: XML has been adopted by many and it works, so why not use it? Do you have an alternative? What do you suggest we use to exchange information? Say you came up with a clever data model and an amazingly powerful compression algorithm to send data over the net. Who would consume your data? You would have to provide APIs for different platforms in order to enable others to manipulate your new format programmatically. And of course, you would have to convince everybody that your way is "better", existing programs would have to be modified, there would be a need for specialized tools to handle special cases, etc., etc. That was the situation when XML came out, and it won hands down over other alternatives. Why do you think that happened? Because the majority of programmers saw the utility of the new format and adopted it. Until someone comes up with a better way (and someone will, for sure), XML is here to stay.

"And XML costs a lot."

I don't know what you mean. Does it cost a lot in terms of development time? Bandwidth? Learning curve? For each of these, the answer is, as usual: it depends on what you want to achieve. It's the same with hardware. A $500 graphics card by itself is neither cheap nor expensive; it depends on what you want it for. If it's for your grandma's computer so she can use AOL, you are paying too much, but if it's for your latest and greatest gaming rig, it may be the best part of your system.

Post #296908
Posted Monday, July 24, 2006 5:01 PM
SSC-Enthusiastic

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 2:47 PM
Points: 132, Visits: 114

Just remember: XML is not a format. It is a format format, akin to a cookie cutter. Cookie cutters usually don't taste very good.

I'd say that the costs of XML languages come in two places: one, lots of people, myself included, find it incredibly difficult to debug, particularly when the data model contains several layers of hierarchy. Two, it is a space hog. In an app I designed, I found that there was a 43x increase in data transmission size over a CSV equivalent file. That killed it right there.
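For anyone who wants to measure the size overhead on their own data, here is a rough sketch, assuming Python's standard csv and xml.etree.ElementTree modules. The rows, element names, and helper functions are hypothetical, and the ratio it produces says nothing about the 43x figure above: the overhead depends heavily on tag names, nesting depth, and field widths.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical rows; swap in a sample of real data to get a meaningful number.
ROWS = [{"provider_id": str(i), "code": "A100", "amount": "25.00"} for i in range(100)]


def as_csv(rows):
    """Serialize the rows as CSV with a header line and return UTF-8 bytes."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")


def as_xml(rows):
    """Serialize the rows as one <row> element per record, one child element per field."""
    root = ET.Element("rows")
    for row in rows:
        rec = ET.SubElement(root, "row")
        for key, value in row.items():
            ET.SubElement(rec, key).text = value
    return ET.tostring(root, encoding="utf-8")


csv_size = len(as_csv(ROWS))
xml_size = len(as_xml(ROWS))
ratio = xml_size / csv_size  # how many times larger the XML form is
```

Using attributes instead of child elements, or compressing the transfer, changes the picture considerably, so it is worth measuring both before ruling either format out.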

That said, if you've got it to work well, then Mazel Tov! Again, the article was mostly a rant against hype. The kind of hype that gets the management types to insist on requiring XML without a clue as to what they are going to do with it once they've got it.

Post #296924
Posted Tuesday, July 25, 2006 10:47 AM
Forum Newbie

Group: General Forum Members
Last Login: Friday, August 12, 2011 4:21 PM
Points: 2, Visits: 7
Sam C,

Sounds like XML would be helpful to me--I also process data on health care providers and fees from several sources. What are the tools or frameworks I should become familiar with? (I use SQL Server and PowerBuilder, but could use scripting languages, too.)
Post #297162