Updating a Text field to XML and its impact on the DB and log-shipping - Help

  • Hi everyone,

    I am due to make a change to our 2005 Production DB. This change involves updating a Text field to XML.

    The questions are:

    •what impact will this have on both databases regarding size/growth and how would I measure and manage this

    •what is the impact to any log shipping to our disaster recovery database as I expect it to grow substantially?

    Can anyone point me in the right direction.

    Many thanks, Phil

  • phil.layzell (5/8/2012)


    Hi everyone,

    I am due to make a change to our 2005 Production DB. This change involves updating a Text field to XML.

    The questions are:

    •what impact will this have on both databases regarding size/growth and how would I measure and manage this

    XML is stored as UTF-16LE internally which occupies (for discussion purposes) double the space of a standard ANSI column for the same number of characters.

    •what is the impact to any log shipping to our disaster recovery database as I expect it to grow substantially?

    Space should be a consideration, especially on the initial column change as it will generate a lot of log activity. How much data are you talking about converting? Log shipping will happily send the schema and any subsequent XML data to your secondary servers via the log restore process. There are no restrictions or limitations that I know of as far as that goes.

    I would plan on checking indexes on all the changed tables after the change, and defragmenting as needed.

    Have you checked all the text to make sure it is valid XML? I have had issues with that when trying to convert an existing column, and it can get tricky when some rows have some poorly formed data.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Excellent, thanks very much for this.

    We are looking at 301,066,784 KB of data so I am expecting this to double. I’ll also look at creating a Pre-prod environment to test this against and check for any poorly formed XML. I let you know how it turns out 🙂 If you think of anything else please let me know.

    Thanks again.

  • I created a test env and , as you warned me, once I update the column I get the following message for a badly-formed xml fragment:

    Msg 9412, Level 16, State 1, Line 1

    XML parsing: line 1, character 43679, '>' expected

    The statement has been terminated.

    Can I turn off the checking during the update, or is it not possible to

    add badly formed data to the xmldata field. Or should I be looking at creating an SSIS package instead? Any help appreciated as the table has 2,005,794 rows, so I can't really check

    the contents of each! 🙂

  • On SQL 2005 your only option is to check each :angry:

    If you have an instance of SQL 2012 around it can be used to identify the bad XML using the new function TRY_CAST...but, you'll still need to do something with the invalid data as you cannot change the column to XML until you do. In my case I escaped it all and stored it as a simple string inside a root named named <invalidXml>, or something like that. By escaped I mean I changed all < to &lt; and such to make it valid XML.

    Edit: escape &lt; for display

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • An option on SQL 2005 would be a SQLCLR function similar to TRY_CAST specifically to handle XML, and return the original string if it is valid, else return escaped XML to store. It would not be a tough function to write.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Now I have a great excuse to use 2012. Thanks for all your help today. :satisfied:

  • Anytime 🙂 If you happen to check out TRY_CAST and have a moment to spare, could you post back for later viewers to know how you got along with it?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Happy to.

    The TRY_CAST returns a value cast to the specified data type if the cast succeeds, if not it returns null. Which solves my problem by immediately telling me which rows if any have the badly formed XML.

    Using :

    select TRY_CAST([LetterText] AS xml ) from tLetter

    Returns :

    NULL – Badly formed XML

    <?MSO-APPLICATION….ETC - Well formed XML

    However, when the cast is explicitly not permitted like so:

    SELECT TRY_CAST(999 AS XML) AS ResultSet

    It will return:

    Msg 529, Level 16, State 2, Line 1

    Explicit conversion from data type int to xml is not allowed.

    This function is definitely a plus as I am sure there will be some useful applications for this.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply