Converting Text to XML

  • I'm pulling data out of a DB where they decided to store XML data with a text data-type. I'm trying to get it back to XML so I can parse it, but I'm getting the following error:

    Msg 9402, Level 16, State 1, Line 1

    XML parsing: line 1, character 56, unable to switch the encoding

    I don't know much at all about this, but I have tried the following:

    SELECTCONVERT(XML,xml_field)

    FROMsource_table

    SELECTCONVERT(XML,CONVERT(VARCHAR(MAX),xml_field))

    FROMsource_table

  • You are probably running into an issue with the encoding attribute of the XML declaration tag within your strings. If it looks like this:

    <?xml version="1.0" encoding="UTF-8" ?>

    You either need to remove it from the string before trying to CAST to XML or you need to change the attribute value to something SQL Server can handle converting, e.g. UTF-16

    <?xml version="1.0" encoding="UTF-16" ?>

    Either way SQL Server is going to drop the declaration tag before storing it as XML so if there are attributes in the tag you need then leaving it as a string may be your best option.

    From http://msdn.microsoft.com/en-us/library/ms345115(v=sql.90).aspx under XML Storage Options:

    It may not be an exact copy of the text XML, since the following information is not retained: insignificant white spaces, order of attributes, namespace prefixes, and XML declaration.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Hi,

    I know I'm a little late to the table here (no pun intended), but I came across this thread because I was getting the same error message. I figured out the problem in my case, and I think I can add a helpful note here.

    If your source is a [text] or [varchar](max) field, and you get the "unable to switch the encoding" error message, that means that the XML text starts with "<?xml version="1.0" encoding="UTF-16" ?>". In this case just cast it to [ntext] or [nvarchar](max). So, relative to Dave's original posting, changing the CONVERT to NVARCHAR()should work:

    SELECTCONVERT(XML,CONVERT(NVARCHAR(MAX),xml_field))

    FROMsource_table

    If your source is an [ntext] or [nvarchar](max) field, and you get the "unable to switch the encoding" error message, that means that the XML text starts with "<?xml version="1.0" encoding="UTF-8" ?>". In this case just cast it to [text] or [varchar](max).

    Basically, if the source XML says "UTF-8", and you want/need it to go through a character data type, the data type has to be [varchar] or [text] (both of which use 1 byte, 8 bit, encoding). If the XML says "UTF-16", and you want/need it to go through a character data type, the data type has to be [nvarchar] or [ntext] (both of which use 2 byte, 16 bit, encoding).

    Sincerely,
    Daniel

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply