Msg 9420, Level 16, State 1, Line 1 XML parsing:

  • Microsoft SQL Server 2008 (RTM) - 10.0.1600.22 (X64) Jul 9 2008 14:17:44 Copyright (c) 1988-2008 Microsoft Corporation Standard Edition (64-bit) on Windows NT 6.0 <X64> (Build 6002: Service Pack 2) (VM)

    VM has 4 processors assigned

    ----------------------------------------

    I have a table where XML docs are stored in a TEXT column. All xml is validated against an XSD before it is inserted or updated to the database (ASP.NET app).

    We want to start using full text search, but the iFilter on a TEXT column cannot be overridden for full text indexing to recognize the column as XML data. There is significant development that would have to be done if we just changed the datatype to XML. Therefore, we decided just to duplicate the data as an XML type in another column or table. The data is sensitive, so I can't really publish it.

    Table schema:

    CREATE TABLE XmlReport

    (

    ReportId INT NOT NULL IDENTITY (1,1) PRIMARY KEY

    , XmlReport TEXT NULL

    )

    When I run the query below, I qet the following error:

    Msg 9420, Level 16, State 1, Line 1

    XML parsing: line 13, character 959, illegal xml character

    SELECT ReportId, CAST(XmlReport AS XML) TrueXml

    FROM TF_XMLReports

    WHERE XmlReport IS NOT NULL

    I can click on the offending XML column in the results tab and SQL Server opens the XML doc in a new window without complaining. I can copy and paste the offending data in an online XML validator with no problem.

    I can one by one filter out line 13 (row 13) and repeat the process. The error is always on line 13, character 959 even though the ReportId on line 13 is different than the one I filtered out. And, I get different rowcounts. Below is the filter I was using to one by one filter out offending rows, the commented numbers after them are the rowcounts after each was subsequently filtered. When I didn't use any filter, the query returned 16 rows:

    AND ReportId NOT IN

    (

    112390 -- 13

    --, 112313 --15

    --, 112301 --14

    --, 112299 -- 12

    --, 112287 --11

    --, 102123 -- 10

    --, 95032 -- 9

    )

    I tried various filters on the ReportId, and got similar results.

    I've used (MAXDOP 1) to make sure I only had one thread running, and got similar results.

    Can someone tell me what the heck is going on here? Is this some obscure Win 2008 64/ SQL Server 64/ VM bug?

  • If you can't post the real data that's causing the problem you'd need to come up with some modified sample data that will show the same effect.

    We need to be able to reproduce that effect on our side in order to be sure that the solution we might come up with will actually resolve your issue.

    What you could do is to modify the data in line 13 before character 959 (e.g. replace "Smith" with "Smith1" and another test with "Smit") and check if the character that fails will move to 960 rsp. 958. In that case you'd need to get to that character and see what it is...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Unfortunately, I think I will have to open a Microsoft incident, aaarrrgghhh.

    The unfiltered query returns 16 rows, out of 3,800, before erroring out, but it tells me the offending XML is on row 13.

    SELECT

    ReportId

    , CAST(XmlReport AS XML) TrueXml

    FROM TF_XMLReports

    WHERE XmlReport IS NOT NULL

    ORDER BY ReportId

    OPTION (MAXDOP 1)

    This is a truncated version of the data:

    RowReportIdTrueXML

    183228<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    294159<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    394160<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    494161<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    594208<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    694211<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    795028<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    895030<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    995031<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    1095032<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    11102123<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    12112287<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    13112299<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    14112301<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    15112313<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    16112390<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

    When I run this query which returns only the offending row, the resultset is retruned without error:

    SELECT

    ReportId

    , CAST(XmlReport AS XML) TrueXml

    FROM TF_XMLReports

    WHERE XmlReport IS NOT NULL

    AND ReportId = 112299

    ORDER BY ReportId

    OPTION (MAXDOP 1)

    RowReportIdTrueXML

    1112299<Report xmlns:xsi="http://www.w3.org/2001/XMLSchem

  • Ok, I finally found out what was going on. The line number that was included with the message had no bearing on the real issue as we strip all whitespace from our XML which includes CR and LF. So, it was misleading. The real problem is the curly quotes that Word inserts through Autocorrect. For some reason, SQL Server doesn't like the curly quotes and considers them invalid XML, even though they validate in Visual Studio and online XML validators.

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply