Lookup table for accented European characters

  • My ETL process involves importing accented European text characters such as German letters with umlauts over the vowels.  However the characters with accents are coming in like this:

    +£ instead of ö
    +ñ instead of ä
    +¦ instead of ó

    What I want to do is use a T-SQL script to convert the imported text to what it should be.  So I want my script to convert Rudy M+£eller to Rudy Möeller.

    I think the best way to do this is with a lookup table, but I don't know what the name of the text I'm converting from is.

    I would be grateful if someone who knows could suggest what kind of lookup table I should use, or could point me towards where I can find such a table.

  • It doesn't really need to be a lookup table at all.  What if there were multiple such substitutions needed for a given data column?   Having it as a lookup table would require advance knowledge of which character pairs needed replacing, and that's not going to be efficient.   Easier to just use three REPLACE functions back to back, one for each possible replacement that may be needed.  You always do all 3 because it's less costly than having to pass through the data twice.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • caspersql - Tuesday, August 7, 2018 9:43 AM

    My ETL process involves importing accented European text characters such as German letters with umlauts over the vowels.  However the characters with accents are coming in like this:

    +£ instead of ö
    +ñ instead of Ã¤
    +¦ instead of Ã³

    What I want to do is use a T-SQL script to convert the imported text to what it should be.  So I want my script to convert Rudy M+£eller to Rudy Möeller.

    I think the best way to do this is with a lookup table, but I don't know what the name of the text I'm converting from is.

    I would be grateful if someone who knows could suggest what kind of lookup table I should use, or could point me towards where I can find such a table.

    Have you tried using unicode on your ETL? That should fix the problem.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares - Tuesday, August 7, 2018 11:19 AM

    Have you tried using unicode on your ETL? That should fix the problem.

    This worked, thanks Luis.  For some reason the BCP import was reading my source CSV file as non unicode even when I specified the widechar variable in T-SQL.  To fix that I opened the file in excel and saved it as a unicode txt file.  When I now import the new unicode txt file it imports correctly as unicode and all accented characters are imported as they should be.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply