Learn XML

  • Comments posted to this topic are about the item Learn XML

  • for those interested, you can still download the free ebook The Art of XSD By Jacob Sebastian at the Redgate books section:

    http://www.red-gate.com/community/books/art-of-xsd

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • XML is a solution looking for a problem. The big mistake is thinking that data interchange is the problem. I have a data file to send that has two formats available, csv and xml, one is quick and easy to write, export and import as the format and schema are well documented and produces a file of about 4MB. The other is slow to produce, difficult to code and the schema is hard to understand, it is also slow to import and produces a file of about 22MB. You've guessed it the first is a CSV and the second is XML.

    Data exchanges do not need to be human readable, they need to be machine readable, and for that small and fast are key, even EDIFACT is better than XML.

    This doesn't mean I haven't learnt it, one day it will find the problem it is looking to be a solution to.

  • Alex Gay (5/3/2012)


    XML is a solution looking for a problem. The big mistake is thinking that data interchange is the problem. I have a data file to send that has two formats available, csv and xml, one is quick and easy to write, export and import as the format and schema are well documented and produces a file of about 4MB. The other is slow to produce, difficult to code and the schema is hard to understand, it is also slow to import and produces a file of about 22MB. You've guessed it the first is a CSV and the second is XML.

    Data exchanges do not need to be human readable, they need to be machine readable, and for that small and fast are key, even EDIFACT is better than XML.

    This doesn't mean I haven't learnt it, one day it will find the problem it is looking to be a solution to.

    I think you are missing other real life requirements:

    * Seamless extensibility (without needing rewrites of consuming processes)

    * Serving multiple audiences with one XML stream instead of having to tailor for each and every consuming party

    Data exchage between N:M party relationships is common and favors XML. And I found coding "adaptors" to import various data streams more error prone in CSV then for XML (I have to do both often). Granted, processing simple streams of simple types is faster in CSV, but when you have more complex data to convey (data with hierarchical structure), XML easily wins in the speed, compactness and ease/speed of processing and coding.

    There is one real problem I experience with XML …. is people! Many people producing code that generates XML have no clue what they are doing, but it is that easy to produce! They are unable to model their data properly and often do not handle escaping well, causing errors in individual elements. But XML escaping is extremely simple and requires only a few find/replace actions you can do in practically every programming environment. It is truly a lack of basic knowledge that is the culprit here. I developed a toolset to work around the encoding issue by dropping those specific objects of interest that have these errors and keep processing the rest of the document.

    So, the author of the article is correct….learn it, XML in its basic form is very easy to understand and not all that hard to navigate. And if you want to see a nicely formatted readably XML document for debugging purposes, just open it with Internet Explorer (provided the file is not humongous).

  • Not to detract from XML, but from a programmer's perspective you better also learn JSON as well since some databases (MongoDB, CouchDB, Persevere) and services/frameworks are leaning in that direction instead of XML since it is more compact and you may have to write/use apps/services that parse XML to JSON and vice versa.

    It's all relative to the size of your programming/database staff and how many hats you are required to wear in a day of course.

    "There is nothing so useless as doing efficiently that which should not be done at all." - Peter Drucker

  • Acconding to some of the local gurus, having Extended Events in XML makes it harder to build automation and parsing scripts, since you never know exactly what "fields" you need to plan for...So you can semi-automate the process.

  • XML is good for exchanging data at the application level, but it's not very efficient for data storage in the database or exchange between two databases. That's why most DBA's are unfamiliar with it. In those situations where a DBA does confront XML, for example when an application is serializing objects in the database, it usually within a negative context; for example CPU spikes and bloated data storage. Most DBAs would rightlyfully rather be dealing with tabular data in relational tables.

    Exposing things like extended events to the outside world as XML makes a lot of sense, but there are tools (like SSMS itself) that will do the work of parsing the tags and presenting it in a visual form. An XML document that contains complex objects like query execution plans or SSIS packages (.dtsx files) may be human readble when opened in Windows Notepad or XML Notepad, but really there is no reason for a DBA to be digging into it the tags, unless he's debugging something or just curious. I've resorted to poking around in XML documents myself on occasion; searching for connection strings or even manually fixing some property value that I had no luck resolving within the SSIS package editor. However, XML is self explanatory, there really is not much to learn. Knowing XSDs or XSLTs are not really required to read the underlying XML.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • peter-757102 (5/3/2012)


    They are unable to model their data properly and often do not handle escaping well, causing errors in individual elements. But XML escaping is extremely simple and requires only a few find/replace actions you can do in practically every programming environment. It is truly a lack of basic knowledge that is the culprit here. I developed a toolset to work around the encoding issue by dropping those specific objects of interest that have these errors and keep processing the rest of the document.

    Indeed, as an XML novice I found escaping an extremely annoying and frustrating task. (This was to produce XML to be read by PHP for displaying on a website.) In addition, I couldn't find an easy way to write out XML from SQL to a file with an included XML header, so I have used a rather clunky work-around.

    Would you have any good links/references about how to handle this?

    Thanks,

    Rich

  • charles.byrne (5/3/2012)


    Not to detract from XML, but from a programmer's perspective you better also learn JSON as well since some databases (MongoDB, CouchDB, Persevere) and services/frameworks are leaning in that direction instead of XML since it is more compact and you may have to write/use apps/services that parse XML to JSON and vice versa.

    It's all relative to the size of your programming/database staff and how many hats you are required to wear in a day of course.

    I agree with you, Charles. I, too, am a developer. I see the importance of XML; it is used in a lot of situations and services. But JSON is very strong; perhaps stronger than XML, at least for transferring data over the wire.

    Kindest Regards, Rod Connect with me on LinkedIn.

  • rmechaber (5/3/2012)


    peter-757102 (5/3/2012)


    They are unable to model their data properly and often do not handle escaping well, causing errors in individual elements. But XML escaping is extremely simple and requires only a few find/replace actions you can do in practically every programming environment. It is truly a lack of basic knowledge that is the culprit here. I developed a toolset to work around the encoding issue by dropping those specific objects of interest that have these errors and keep processing the rest of the document.

    Indeed, as an XML novice I found escaping an extremely annoying and frustrating task. (This was to produce XML to be read by PHP for displaying on a website.) In addition, I couldn't find an easy way to write out XML from SQL to a file with an included XML header, so I have used a rather clunky work-around.

    Would you have any good links/references about how to handle this?

    Thanks,

    Rich

    There are several ways, but so far I only had to generate XML from the application layer and not SQL. Once you know how and when to escape data, the problem is reduced to nothing more then concatenating strings in a natively efficient manner when using this method. For SQL specifically there are other ways as well, some likely more efficient (and complex), but I got no experience with them.

    You could start with SELECT and the FOR XML clause for simple set representations in XML, but the output does not necessarily match what you need to generate (XML works case sensitive for example).

    Documentation for the FOR XML clause: http://msdn.microsoft.com/en-us/library/ms190922(v=sql.100).aspx

    As for exporting data to a file, that is something I found always lacking from T-SQL. Importing is easy and has fairly good support, exporting not so...it would be my number one wish for any new release! I hate DTS and any such overly complex replacement and just expect to be able to specify an output file in my SQL clause and be done with it (for the sake of both sanity and efficiency).

    I assume you know the 5 characters that you need to replace? (this forum did not allow me to post them as it generated an error)

  • peter-757102 (5/3/2012)


    rmechaber (5/3/2012)


    peter-757102 (5/3/2012)


    They are unable to model their data properly and often do not handle escaping well, causing errors in individual elements. But XML escaping is extremely simple and requires only a few find/replace actions you can do in practically every programming environment. It is truly a lack of basic knowledge that is the culprit here. I developed a toolset to work around the encoding issue by dropping those specific objects of interest that have these errors and keep processing the rest of the document.

    Indeed, as an XML novice I found escaping an extremely annoying and frustrating task. (This was to produce XML to be read by PHP for displaying on a website.) In addition, I couldn't find an easy way to write out XML from SQL to a file with an included XML header, so I have used a rather clunky work-around.

    Would you have any good links/references about how to handle this?

    Thanks,

    Rich

    There are several ways, but so far I only had to generate XML from the application layer and not SQL. Once you know how and when to escape data, the problem is reduced to nothing more then concatenating strings in a nativly efficient manner when using this method. For SQL specifically there are other ways as well, some likely more efficient (and complex), but I got no experience with them.

    You could start with SELECT and the FOR XML clause for simple set representations in XML, but the output does not neccecarily match what you need to generate (XML works case sensitive for example).

    Documentation for the FOR XML clause: http://msdn.microsoft.com/en-us/library/ms190922(v=sql.100).aspx

    As for exporting data to a file, that is something I fould always lacking from T-SQL. Importing is easy and has faily good support, exporting not so...it would be my number one wish for any new release! I hate DTS and any such overly complex replacement and just expect to be able to specify an output file in my SQL clause and be done with it (for the sake of both sanity and efficiency).

    I assume you know the 5 characters that you need to replace (this forum did not allow me to post them as it generated an error)?

    I've found articles that mention using BCP to export an XML resultset to a file.

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=9336&whichpage=2

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I've been using XML in various database-related solutions for several years. Started with SQL 2005, really.

    The main problem I've run into with it has been very, very poor documentation. The SQL Server XQuery documentation started out being worse than useless when I first needed to dig into it. Circular definitions, lack of examples, examples that didn't actually work, and a writing style that assumed you already knew everything there was to know about XML and just need the specifics to Microsoft's implementation of XQuery functions.

    The W3C documentation on XML largely assumes you already know everything about XML and are looking to compare implementations to standards, not looking to figure out what the heck the standards mean in the first place.

    I think a lot of the problem with XML is that its been documented by people who are very good at getting computers to communicate with each other, but who haven't the faintest clue of how to communicate with other people. Hate to say that, but it really feels like that.

    Over the years since I first started dealing with it, I've noticed the documentation has improved steadily in quality, and in quantity. It's a lot easier to find what I need these days than when I started with it.

    But it still very much depends on already knowing the subject pretty well before you can even search for answers. If you don't know the term "nodes function", you really can't Google/Bing/whatever "how do I separate rows of XML into rows of tabular data" and get anything that will help you out. Try both searches, see what you get.

    It's moved from "you have to be an expert to even ask a question" to "you have to know some of the terminology before you can ask a question". That's a huge improvement, but it's still got a ways to go.

    Where I'm at? I know XML in SQL Server well enough to solve some interesting problems in efficient ways, but it's not as easy as it should be.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Eric M Russell (5/3/2012)


    I've found articles that mention using BCP to export an XML resultset to a file.

    http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=9336&whichpage=2

    I know of bcp, and it is (convenience wise) for export a far cry from what BULK INSERT is in T-SQL for imports.

  • I think XML has its place but it has been over-hyped and inappropriately used in various situations.

    The product requirements should clearly indicate that using XML (for whatever purpose) is advantageous or necessary.

    The probability of survival is inversely proportional to the angle of arrival.

  • I agree Steve, that if you are currently a DBA, it is only going to benefit you to know XML not just only now, but also definitely down the road as well. I live in Miami/Fort Lauderdale and for the same exact reasons it definitely benefits me now, as well as in the future, to learn Spanish. 😀

    "Technology is a weird thing. It brings you great gifts with one hand, and it stabs you in the back with the other. ...:-D"

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply