String manipulation

  • Hi,

    I want to replace some characters from a string with some other characters like following

    declare @pname varchar(255)

    set @pname='This iŠ a tËšt nãmÉ'

    select

    (replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace

    (replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace

    (replace (replace (replace(replace(replace(replace (replace (replace(replace(replace(replace (replace (replace(replace

    (replace(replace (replace (replace(replace(replace(replace (replace (replace(replace(replace(replace (replace (replace(replace

    (replace(replace (replace (replace(replace(replace(replace (replace (replace(replace(replace(replace (replace (replace(replace

    (replace(replace (replace (replace(replace(replace(replace (replace (replace(replace(replace(replace (replace (replace(replace

    (replace(replace (replace (replace(replace(replace(replace (replace (replace(replace(replace(replace(replace(replace

    (@pname,'Š','S'),'Œ','O'),'Ž','Z'),'š','s'),'œ','o'),'ž','z'),'Ÿ','Y'),'¥','Y'),'µ','u'),'À','A'),'Á','A'),'Â','A'),'Ã','A'),'Ä','A'),'Å','A'),

    'Æ','A'),'Ç','C'),'È','E'),'É','E'),'Ê','E'),'Ë','E'),'Ì','I'),'Í','I'),'Î','I'),'Ï','I'),'Ð','D'),'Ñ','N'),'Ò','O'),'Ó','O'),'Ô','O'),'Õ','O'),

    'Ö','O'),'Ø','O'),'Ù','U'),'Ú','U'),'Û','U'),'Ü','U'),'Ý','Y'),'ß','s'),'à','a'),'á','a'),'â','a'),'ã','a'),'ä','a'),'å','a'),'æ','a'),'ç','c'),

    'è','e'),'é','e'),'ê','e'),'ë','e'),'ì','i'),'í','i'),'î','i'),'ï','i'),'ð','o'),'ñ','n'),'ò','o'),'ó','o'),'ô','o'),'õ','o'),'ö','o'),'ø','o'),

    'ù','u'),'ú','u'),'û','u'),'ü','u'),'ý','Y'),'ÿ','Y'),

    '!',' '),'@',' '),'#',' '),'$',' '),'%',' '),'^',' '),'&',' '),'*',' '),'(',' '),')',' '),'{',' '),'}',' '),'[',' '),']',' '),'|',' '),'\',' '),

    ':',' '),';',' '),'<',' '),'>',' '),'.',' '),'~',' '),'?',' '),'/',' '),'-',' '),'+',' '),'=',' '),'`',' '),'_',' '),',',' '),'"',' '),'''',' '),

    '€',' '),'£','')))

    This way , the performance is slower as it contains nested function so many times.

    does anyone have any better idea to do that?

    thanks in advanced

  • I would use a table to store the character value and the substitute.

    Step two would be the replacement by calling the REPLACE function for each entry in the table and assign the result to the same variable each time.

    Usually, you could use @pname in the SELECT statement but I decided to use a separate variable for demonstration purposes (old and new value).

    DECLARE @pname VARCHAR(255)

    DECLARE @cname VARCHAR(255)

    SET @pname='This iŠ a tËšt nãmÉ'

    SET @cname=@pname

    CREATE TABLE #tbl ( c1 CHAR(1), c2 CHAR(1))

    INSERT INTO #tbl

    SELECT 'Š'AS c1, 'S' AS c2 UNION ALL

    SELECT 'È','E' UNION ALL

    SELECT 'É','E' UNION ALL

    SELECT 'Ê','E' UNION ALL

    SELECT 'Ë','E' UNION ALL

    SELECT 'à','a' UNION ALL

    SELECT 'á','a' UNION ALL

    SELECT 'â','a' UNION ALL

    SELECT 'ã','a' UNION ALL

    SELECT 'ä','a' UNION ALL

    SELECT 'å','a'

    SELECT @cname= REPLACE(@cname,c1,c2)

    FROM #tbl

    SELECT @pname AS 'BEFORE',@cname AS 'AFTER'



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • A tally table approach may be faster.

    E.g. create a tally table as in this article:

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    Create a lookup table for the find and replace strings:

    CREATE TABLE Replace_Lookup (Find CHAR(1) PRIMARY KEY , [Replace] CHAR(1))

    INSERT INTO Replace_Lookup

    SELECT 'Š' AS find,'s' AS [REPLACE]

    UNION ALL

    SELECT 'Ë' AS find,'e' AS [REPLACE]

    UNION ALL

    SELECT 'ã' AS find,'a' AS [REPLACE]

    UNION ALL

    SELECT 'É' AS find,'e' AS [REPLACE]

    UNION ALL

    SELECT 'š' AS find,'s' AS [REPLACE]

    and then use something like the below to replace all the chars:

    declare

    @pName varchar(255), @Replaced varchar(255)

    Select @Replaced = '',@pName= 'This iŠ a tËšt nãmÉ'

    SELECT @Replaced = @Replaced + ISNULL(Replace_Lookup.Replace,SUBSTRING(@pName,N,1))

    FROM Tally

    LEFT JOIN Replace_Lookup ON Replace_Lookup.Find=SUBSTRING(@pName,N,1)

    WHERE N <= DATALENGTH(@pName)

    SELECT @Replaced

    However, a CLR solution is the best approach for this type of string manipulation

  • Be careful. Depending on the collation being used you can get very odd results when comparing characters. For instance, I am using collation Latin1_General_CI_AS and the insertions into table Replace_Lookup fail because of a primary key constraint ('Ë' = 'É'). Use a binary comparison by using the COLLATE clause (i.e. REPLACE(@cname,c1 COLLATE Latin1_GENERAL_BIN,c2) or compare characters by their Ascii value (using function ASCII).

    Check out the result of this script to see which character are equal in a given collation:

    with Tally as

    (

    select top 256

    number - 1 N

    from

    master..spt_values

    where

    type = 'P'

    )

    select

    t1.N, char(t1.N), t2.N, char(t2.N)

    from

    Tally t1

    cross join

    Tally t2

    where

    t1.N <> t2.N

    and char(t1.N) = char(t2.N)

    order by

    t1.N

    Peter

  • Recently, I came across a similar post to which Jeff Moden has provided a WHILE loop based solution which beats the TALLY table solution.

    Here is the post:

    http://www.sqlservercentral.com/Forums/Topic860321-338-1.aspx

    --Ramesh


  • Ramesh Saive (2/26/2010)


    Recently, I came across a similar post to which Jeff Moden has provided a WHILE loop based solution which beats the TALLY table solution. Here is the post:

    http://www.sqlservercentral.com/Forums/Topic860321-338-1.aspx

    ...and I modified your solution in that thread to improve performance by a factor of more than twenty 😉

    The REPLACE solution is fastest. A CLR solution (from previous experience and testing) is only slightly slower, but generally more flexible.

    Paul

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply