Is XML the Answer?

  • quote:


    I agree with the article. I have to ask the possibly naive question: why can't you do all that you can do with XML using a more efficient file format?

    I also have to ask: if vendors were to put all of the time and effort into building standards for transforming other file formats, say CSV, and developing frameworks of code for reading other more efficient file formats, parsing them, treating them like data sets, etc. as they've done with XML, essentially creating the same base of standards and support as they've done for XML, then what would be the advantage of using XML over those more efficient file formats?

    Why haven't vendors spent more time and energy in developing and supporting more efficient file formats?


    My cynical side says it's because those more efficient file formats already exist and there is no money to be made by promoting them.

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • quote:


    quote:


    I agree with the article. I have to ask the possibly naive question: why can't you do all that you can do with XML using a more efficient file format?

    I also have to ask: if vendors were to put all of the time and effort into building standards for transforming other file formats, say CSV, and developing frameworks of code for reading other more efficient file formats, parsing them, treating them like data sets, etc. as they've done with XML, essentially creating the same base of standards and support as they've done for XML, then what would be the advantage of using XML over those more efficient file formats?

    Why haven't vendors spent more time and energy in developing and supporting more efficient file formats?


    My cynical side says it's because those more efficient file formats already exist and there is no money to be made by promoting them.


    You may be right. Businesses do exist to make money, after all. It seems to me that many of the supposed benefits of XML have nothing to do with XML per se; they are benefits that result from the infrastructure that vendors have built around XML. It's no easier to type a raw XML file than it is to type a raw CSV file; in fact, the latter probably takes fewer keystrokes. If my CSV file has a header, it's as simple for an application to determine the meaning of any specific field as it is in XML, and my applications, written correctly, could adjust to changes in the ordering of fields in a CSV file just as they can in an XML file. What makes XML easy is that vendors have developed interfaces for XML files that make it easy to transform them, parse them, and interact with them; they could have done the same with CSV, though, had they wanted to. They could have specified a structure for defining a CSV file header, a language for translating a CSV file from one format to another, and interfaces to parse CSV files, and it would then have been as easy to work with them as it is to work with XML. If vendors had spent the time to build this kind of infrastructure around other, more efficient file formats, we could derive the same benefits from those file formats that can be derived from XML, and we'd have a more efficient file format to boot.

    As it is, vendors have latched on to XML, sold it to us as an answer to all of our troubles, given us tools to make it easy, and everyone has jumped on board; ease of use is a powerful tool when it comes to popularizing a technology.
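
To illustrate the point above that a CSV file with a header lets an application cope with reordered fields, here is a minimal sketch using Python's standard `csv` module. The two sample exports and the `read_records` helper are made up for illustration; nothing here comes from a real system.

```python
import csv
import io

# Two hypothetical exports of the same records, with the columns in a different order.
EXPORT_A = "id,name,size\n1,red shirt,L\n2,blue shirt,M\n"
EXPORT_B = "size,id,name\nL,1,red shirt\nM,2,blue shirt\n"


def read_records(text):
    """Parse CSV text into dicts keyed by the column names in the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


records_a = read_records(EXPORT_A)
records_b = read_records(EXPORT_B)

# Fields are looked up by name, so the column order does not matter.
print(records_a == records_b)
print(records_b[0]["name"])
```

Because every field is addressed by its header name rather than its position, the reader does not care which order the producer chose. What CSV lacks out of the box is the nesting and repeating groups discussed elsewhere in this thread.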

  • When I was listening to a good XML lecture by a really knowledgeable XML guru, I noted some things that are seldom emphasized:

    XML = DOM (Document Object Model)

    XML is NOT <tags> (tags are only a textual representation of XML; you can serialize it in a binary format and not worry about the size; the same goes for SOAP)

    An XML document without a schema can only be well-formed XML; it can in no way be valid (that is, guaranteed to contain data that you would put into a database)

    As long as you map the data structure from the XML file to the DOM, validate it against its schema, and map it to your well-developed database, you are on the safe side.

    I agree that developers using a database to store XML documents as text have missed the point, but in some cases that may be necessary (forget about using that data in any way other than text storage; an archive, maybe?).

    Developers writing "optimized" XML parsers are on the wrong track too, because I doubt they are able to fully support the latest standards. An XML parser should be a platform service that developers use to (de)serialize XML into and out of the DOM.

    Just my tiny contribution.
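
The distinction drawn above between well-formed and valid XML can be sketched with Python's standard `xml.dom.minidom`. The standard library does not validate against a schema, so `quantities_are_integers` below is a hand-written, hypothetical stand-in for the kind of rule a schema would enforce; the sample documents are also invented.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

GOOD = "<order><item qty='2'>shirt</item></order>"
BROKEN = "<order><item qty='2'>shirt</order>"        # <item> is never closed
ODD = "<order><item qty='two'>shirt</item></order>"  # well-formed, but qty is not a number


def is_well_formed(text):
    """Return True if the text parses into a DOM at all."""
    try:
        minidom.parseString(text)
        return True
    except ExpatError:
        return False


def quantities_are_integers(text):
    """Hypothetical stand-in for schema validation: every qty must be a whole number."""
    doc = minidom.parseString(text)
    return all(item.getAttribute("qty").isdigit()
               for item in doc.getElementsByTagName("item"))


print(is_well_formed(GOOD), is_well_formed(BROKEN), is_well_formed(ODD))
print(quantities_are_integers(GOOD), quantities_are_integers(ODD))
```

The third document parses without complaint, which is the point being made: a parser alone only tells you the tags are balanced, not that the data is fit to load into a database.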

  • quote:


    quote:


    When I was listening to a good XML lecture by a really knowledgeable XML guru, I noted some things that are seldom emphasized:

    XML = DOM (Document Object Model)

    XML is NOT <tags> (tags are only a textual representation of XML; you can serialize it in a binary format and not worry about the size; the same goes for SOAP)

    An XML document without a schema can only be well-formed XML; it can in no way be valid (that is, guaranteed to contain data that you would put into a database)

    As long as you map the data structure from the XML file to the DOM, validate it against its schema, and map it to your well-developed database, you are on the safe side.

    I agree that developers using a database to store XML documents as text have missed the point, but in some cases that may be necessary (forget about using that data in any way other than text storage; an archive, maybe?).

    Developers writing "optimized" XML parsers are on the wrong track too, because I doubt they are able to fully support the latest standards. An XML parser should be a platform service that developers use to (de)serialize XML into and out of the DOM.

    Just my tiny contribution.


    I think the XML guru overstated his point. I wouldn't say that XML = DOM, but the two are equivalent. XML is a language for describing documents that conform to the DOM, and a key part of that language is the tags. The definition of a well-formed XML document refers to the tags (e.g. every opening tag must have a matching closing tag, there must be only one root tag (node), tags must nest properly, etc.), and hence the language of the file, the tags, accurately defines a document structured according to the DOM. XML is not entirely tags, but tags are a fundamental piece of XML, and if you remove them, you do not have XML. It is these tags that many refer to when they state that XML is readable; the tags make it easy to understand what any particular node of the document represents. This agrees with the point that XML is a representation of a document that conforms to the DOM. Remove these tags, and you will definitely have a more efficient format for storage and data transfer, and you may even have a document that still conforms to the DOM; but you will no longer have XML. You will have some other language for expressing a document stored according to the DOM. Store the binary representation of a DOM document, and again it will be more efficiently stored and transferred, and you will be able to represent it as XML if you so choose; but this is not what is happening. Rather, DOM documents are being represented as XML, and that representation is what is being stored and transferred; that representation is terribly inefficient. Using binary representations of a DOM is a step in the right direction as far as efficiency goes, but I think there would still be many who would point out that the DOM is a specialized variation of the hierarchical model and inherits the weaknesses of that model, continuing to lack a firm foundation in math or logic.

    Edited by - mdburr on 10/22/2003 12:50:51 PM
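
As a rough illustration of the storage argument above, the sketch below serializes the same records once as tagged XML text and once as a fixed binary layout, then compares the byte counts. The records, the element names, and the `"<id"` layout (a 4-byte integer followed by an 8-byte double) are all assumptions chosen for the example, not a real binary XML format.

```python
import struct
import xml.etree.ElementTree as ET

ROWS = [(1, 19.99), (2, 5.25), (3, 102.5)]  # hypothetical (id, price) records


def to_xml(rows):
    """Serialize the rows as tagged XML text, returned as bytes."""
    root = ET.Element("rows")
    for row_id, price in rows:
        row = ET.SubElement(root, "row")
        ET.SubElement(row, "id").text = str(row_id)
        ET.SubElement(row, "price").text = str(price)
    return ET.tostring(root)


def to_binary(rows):
    """Pack each row as a little-endian 4-byte int plus 8-byte double (12 bytes)."""
    return b"".join(struct.pack("<id", row_id, price) for row_id, price in rows)


xml_bytes = to_xml(ROWS)
bin_bytes = to_binary(ROWS)
print(len(xml_bytes), len(bin_bytes))

# The binary form still carries the same data, but only if the reader knows the layout.
print(struct.unpack("<id", bin_bytes[:12]))
```

The binary form is much smaller, but it is only readable by a consumer that already knows the layout, which is the trade-off the readability argument for tags turns on.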


  • I have read similar articles in the past, and I agree with the points about XML's inefficiency and about data modeling. The author is missing one strong point that XML has in its favour: people can agree on it as the lowest common standard, unlike years of other technologies (RPC, COM, CORBA, EJB ...) that are incompatible with one another. In my opinion, XML should be used for IPC (inter-process communication) at the public enterprise level.

  • I think this is about realizing when a tool/product just can't do everything you're using it for.

    It's time to use common sense, and find that balance of what's the best answer, with what answer pleases us, or our end-users...

    BTW I just blogged about this story at http://cfpurists.blogspot.com/2004/10/is-xml-answer.html

  • Pretty simple in my mind - Never store XML in a database. Use it to transport query results to folks who want it that way. Period. If you have control of both ends of a data transport system use something else, like true recordsets.

    Student of SQL and Golf, Master of Neither

  • I think the author should define the question before he writes what the answer is. What I get from experience, and from reading the article and comments, is that if the question is "Is XML the answer?", the answer is yes and no.

    I have worked on a web service node that queries a SQL database, builds XML, and makes that XML available to whoever the caller is. We control which users can access the node through authentication, but we do not control any of the technology, environment, etc. that the caller uses to get there or to use the result. With XML we are able to define the data structure far beyond "a red shirt, size large". We can include repeating groups and other constructs in the schema that allow us to meet varied user needs without fielding separate demands for metadata and special handling.
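
A service like the one described above might look roughly like the following sketch, which reads from an in-memory SQLite database and emits XML with a repeating `<item>` group under each order. The table names, column names, and the `orders_to_xml` function are hypothetical and stand in for whatever the real node uses.

```python
import sqlite3
import xml.etree.ElementTree as ET


def orders_to_xml(conn):
    """Build an XML document with one <order> per row and repeating <item> children."""
    root = ET.Element("orders")
    orders = conn.execute("SELECT id, customer FROM orders ORDER BY id").fetchall()
    for order_id, customer in orders:
        order = ET.SubElement(root, "order", id=str(order_id), customer=customer)
        items = conn.execute(
            "SELECT product, size FROM items WHERE order_id = ? ORDER BY product",
            (order_id,),
        )
        for product, size in items:
            ET.SubElement(order, "item", size=size).text = product
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data, created in memory so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE items (order_id INTEGER, product TEXT, size TEXT);
    INSERT INTO orders VALUES (1, 'acme');
    INSERT INTO items VALUES (1, 'red shirt', 'L'), (1, 'blue shirt', 'M');
""")

xml_text = orders_to_xml(conn)
print(xml_text)
```

The repeating group is the part a flat record format struggles with: one order carries any number of items without the caller needing a second file or a side agreement about how rows relate.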

    I think part of the problem is that the author sat at a conference where 'XML experts' were not at the table. In the years I have been in this business I have heard experts who should have been on the team setting up the tables instead of sitting at them. The folks you heard were not experts; experts know that XML has value, uses, and a future. But it is not the answer to everything. It has its place.

    Not all gray hairs are Dinosaurs!

  • You misunderstand; I am emphatically stating that XML is NOT the answer to anything. At its best XML is a bloated and inefficient method of data exchange. I am aware that it has become a "standard" of sorts and as such does convey some small benefit. Those benefits have NOTHING to do with XML per se, and are solely due to the widespread adoption of XML. But it is a stupid standard, thought up by those who are completely ignorant of data and data management fundamentals. Now we have everybody jumping on the XML bandwagon, using it everywhere as if those three letters contain some sort of magic that fixes all your data integration problems. WRONG! If anything, it will add to your problems by imposing additional overhead on your storage, computing, and networking systems.

    The biggest problem with data integration is poor database design. Vendors and individuals are too often willing to design databases that are nothing more than data persistence mechanisms for their applications. They don't properly define and model the meaning of the data, so when it comes time to integrate systems, you are faced with the problem of decoding two "black boxes" and making them talk to each other.

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • Amen, DP. Sorry I missed the original printing. I think this is still a great article, and I'm glad to see there are people like you willing to publish potentially unpopular views like this.

  • I loved the article! I don't completely agree with it, as I think XML has its place, although I think that place is very limited. Like all tools, it's ridiculous when used in the wrong way.

    I like to think of it like this: using technology for technology's sake gives IT professionals a bad reputation. Every problem has a solution that fits; one solution does not fit all problems, and a deep understanding of your unique business systems and the available technology is the only way you can make the best choice. Recently I posted this to a client who was enquiring about our progress:

    "Even though we have a great product (and we’re making it better) we have to work very hard to sell it, as people (and I suspect you know) are very quick to dismiss the theory until someone with a professional approach and the technical knowledge to understand the theory evaluates the software. I think it’s a direct result of there being too many amateurs in our industry (which also has its opportunities) not knowing what they are talking about, although they fully believe they do. Using scientific methods of controlled experiments to gather appropriate data is not a strong point of the average IT person – it seems they are willing to believe what they are told by those with the biggest voices, and unfortunately we aren’t one. I still hope that the longer we are here, the more the professionals within our industry will come to learn of us and will validate our approach with enthusiasm, thus giving us the voice we desire and need."

    So many "professionals" are willing to blindly follow the advice of those who are themselves ignorant, or who have an overriding financial interest in what they are saying, disregarding the wants and needs of those they are advising.

    Thank you, Don Peterson, for your refreshing view and for making me believe that not everyone is blind.

    regards,

    Mark Baekdal

    http://www.dbghost.com

    http://www.innovartis.co.uk

    +44 (0)208 241 1762

    Build, Comparison and Synchronization from Source Control = Database change management for SQL Server

  • Loved the article

    My dealings with XML have involved a couple of friends' companies (including one very large one) converting all their systems, at great cost, to be XML-based, and then converting everything back again, at great cost, to their old bespoke way of doing things because of the huge difference in performance, among other things.

    And I've always hated XML when I've been writing software. I normally end up writing my own code to wrap the data I'm processing (it ends up working much faster for me, and I find it simpler!) rather than using the XML solutions that are lying around or built into the product I'm using.

    martin

  • As far as I am concerned XML was supposed to be something that separated out content from presentation for the web. It was a development of SGML.

    I agree with the article. You can transport data using it, but would you want to?

    Being able to eyeball your data in XML strikes me as the modern equivalent of being able to read a punch card.

  • The same manner of argument was made in the '70s concerning two digits: "19". Y2K has come and gone and everyone sees the benefit... job security.

    Bandwidth will get cheaper, as will storage space. A good analyst would try to make their job easier and more readable. An excellent analyst would try to make everyone's job easier, not just their own.


    Keep on truckin'
    Steel Rope

  • XML is the answer if the question is "What comes after 'WML'?" Whatever happened to the NSF format? Oh, yeah, it was replaced by ANSI X.12. But where is ANSI X.12? Oh, yeah, it was replaced by XML. Obviously, there is no such thing as a standard, and XML will shortly follow its brethren onto the heap of unfulfilled standards...
