MAJOR ISSUE TO FIX ASAP: data loss for characters not included in code page 1252

  • I have mentioned this in two others posts but need to make it very clearly visible given that fixing the issue will not retroactively fix the mangled data (unlike fixing other display issues, such as bulleted / numbered lists now appearing correctly):

     

    Characters that are not found on Code Page 1252 are not stored correctly. They are at best re-mapped to something similar via "best fit" mappings, or at worst are converted to the default replacement character: "?".

     

    Content that was ported from the old site seems to have come over correctly. BUT, any new content, either submitted via the website or even imported (e.g. blog posts). The following list of characters should be the only non-standard ASCII characters that save correctly based on my testing thus far (they are the ASCII Extended set -- values 128 - 255 -- via Code Page 1252):

     

    €  ‚ ƒ „ … † ‡ ˆ ‰ Š ‹ Œ  Ž   ‘ ’ “ ” • – — ˜ ™ š › œ  ž Ÿ   ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ­ ® ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ

     

    There should only be 5 ""s in that set as there are 5 undefined values. That list was generated using the following T-SQL:

     

    DECLARE @CP1252characters VARCHAR(300) = '';

    SELECT TOP (128)
    @CP1252characters += ' ' + CHAR(ROW_NUMBER() OVER (ORDER BY @@MICROSOFTVERSION) + 127)
    FROM master.sys.columns;

    PRINT @CP1252characters;

    As I mentioned in the "New blog import process mangles blog content" issue, there is clearly a VARCHAR parameter and/or variable and/or column being used somewhere in the saving process.

     

    Take care,

    Solomon...

     

    P.S. If you are unable to find the issue quickly, I am available for consulting. I'm not being flippant or sarcastic or anything that might be perceived as negative or presumptuous here. Character encoding is a highly complex and tricky topic that trips up most computer folk, even the very intelligent / talented ones. But, it's something I have been researching / specializing in for 6 years now, and am able to assist with if needed / wanted 😺 .

     

    P.P.S. This is actually a re-post. Against my better judgement, I made the mistake of editing my original post and lost it: https://www.sqlservercentral.com/forums/topic/major-issue-to-fix-asap-data-loss-for-characters-not-included-in-code-page-1252

     

    P.P.P.S I am tempting fate here by editing this post (2nd edit, actually), but I am feeling daring 😼

     

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

  • Noted - we'll dig in as soon as we can and get back to you.

  • This issue was ultimately the result of string literals not being prefixed with an upper-case "N" in the T-SQL code generated by the PHP code (of ProjectNami). I submitted a fix for that to ProjectNami and it was recently merged (hence, official):

    https://github.com/ProjectNami/projectnami/pull/422

     

    Fixing that issue on SSC.com, fortunately, does not require any extensive work or upgrade of ProjectName. It only requires adding that single-character to the same line of code (even if that line is at a different line number in the SSC.com version of ProjectNami). Please see the following SSC.com forum post for details:

    https://www.sqlservercentral.com/forums/topic/can-post-please-be-stored-in-nvarchar/page/2#post-3929480

     

    Take care,

    Solomon...

     

    SQL#https://SQLsharp.com/ ( SQLCLR library ofover 340 Functions and Procedures)
    Sql Quantum Lifthttps://SqlQuantumLift.com/ ( company )
    Sql Quantum Leaphttps://SqlQuantumLeap.com/ ( blog )
    Info sitesCollations     •     Module Signing     •     SQLCLR

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply