Weird Character in data and needs replaced

  • Hello!

    I am attempting to figure out how to identify a weird character within my data so that I can perform a replace on it.

    The string currently looks like:

    P123 GÇô Test User

    Edit: Please note that this keeps changing my special character so a notepad document is attached with it in it.

    I can figure out all of the rows that have this character in it using one of the following queries:

    SELECT * FROM dbo.table WHERE Col NOT LIKE '%' + NCHAR(45) + '%';

    SELECT * dbo.table WHERE Col NOT LIKE '%-%';

    The end goal is to update the special character to be a dash or NCHAR(45), but I dont know what to use for that special character in my REPLACE function within the UPDATE statement.

    How can I identify what this special character is in TSQL?

    Thank you!

    • This topic was modified 2 years, 10 months ago by  GBeezy.
    • This topic was modified 2 years, 10 months ago by  GBeezy.
    • This topic was modified 2 years, 10 months ago by  GBeezy.
    • This topic was modified 2 years, 10 months ago by  GBeezy.
    • This topic was modified 2 years, 10 months ago by  GBeezy.
    Attachments:
    You must be logged in to view attached files.
  • I think the easiest way is going to be by casting it to VARBINARYA(MAX) and then manually going through.

    So something like this:

    SELECT CAST('HELLO' AS VARBINARY(MAX))

    Which gives you:

    0x48454C4C4F

    0X is telling me that it is in HEX, so it can be discarded.  48 is H, 45 is E, 4C is L and 4F is O.  Every 2 characters in the VARBINARY is 1 letter in the VARCHAR.  Now, if your character string is NVARCHAR, then you end up with something more like this:

    0x480045004C004C004F00

    Here, every 4 characters is 1 character, so 4800 is H, 4500 is E and so on.

    BUT once you have it in VARBINARY, it is just a matter of narrowing things down to determine which CHAR value to use.  For example, CHAR(48) will give you an H and so on so the following:

    SELECT CHAR(0x48) + CHAR(0x45) + CHAR(0x4C) + CHAR(0x4C) + CHAR(0x4F)

    For NCHAR you need to swap the bits a bit.  So:

    SELECT NCHAR(0x0048) + NCHAR(0x0045) + NCHAR(0x004C) + NCHAR(0x004C) + NCHAR(0x004F)

    ALTERNATELY, if you don't like working with HEX in CHAR and NCHAR, you can convert those to DECIMAL.  So 48 becomes 72, 45 becomes 69, 4C becomes 76, and 4F becomes 79 so:

    SELECT CHAR(72) + CHAR(69) + CHAR(76) + CHAR(76) + CHAR(79)

     

     

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Brian is correct the only way to find out what to replace is to convert it to hex (VARBINARY).

    You cannot use the characters that you see in SSMS, notepad or any other text app as each app may interpret the characters differently.

    Your text file in notepad shows P123 GÇô Test User, the 3 characters are actually made up from two hex pairs each CE 93 C3 87 C3 B4.

    What is the datatype of the column?

    Can you post the result of the column converted to varbinary?

    Far away is close at hand in the images of elsewhere.
    Anon.

  • .

    Eddie Wuerch
    MCM: SQL

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply