Relplace all but a-zA-Z0-9

  • Hello,

    I had a relitivily sadistic query as seen in the first example to remove any invalid characters that made it into our SKU's.  This refers to any character not between A-Z and 0-9.

    update sqlsale_merch_1

    set I_SKU = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(I_SKU,'=',''),'~',''),'?',''),'|',''),'<',''),'>',''),'+',''),',',''),'[',''),']',''),';',''),':',''),'-',''),'"',''),'\',''),'/',''),'.',''),'''',''),'`','')

    where tran_date >= '2004-03-07' and tran_date <= '2004-03-14'

    and (i_sku LIKE '%`%' or i_sku LIKE '%''%' or i_sku LIKE '%.%' or i_sku LIKE '%/%' or i_sku LIKE '%\%' or i_sku LIKE '%"%' or i_sku LIKE '%-%'or i_sku LIKE '%:%' or i_sku LIKE '%;%' or i_sku LIKE '%]%' or i_sku LIKE '%[%' or i_sku LIKE '%,%' or i_sku LIKE '%+%' or i_sku LIKE '%<%' or i_sku LIKE '%>%' or i_sku LIKE '%|%' or i_sku LIKE '%~%' or i_sku LIKE '%?%' or i_sku LIKE '%=%')

    I am sure eveyone looking at that has to be laughing harder than they have ever done before, but at the time I was in a big rush and I knew that would work.  Yesterday we got a few more foreign characters that I was not taking into account so I was forced to find the "Correct" way to do this, because I can not come in every monday mourning and discover that a new character needs to be accounted for.  I found the following script, which works ok, for the most part.

    select replace(i_sku, substring(i_sku, patindex('%[^a-zA-Z0-9]%', i_sku), 1), '')

     from sqlsale_merch_1

    where tran_date >= '2002-03-07' and tran_date <= '2002-03-15'

    group by i_sku order by i_sku

    The issue is that it still contains +, Ü, and a few other accented charaters. 

    Can anyone point me to a true A-Z and 0-9 only solution.



    Patrick L. Lykins

  • create function fn_cleanup (@i varchar(4000))

    returns varchar(4000)



      declare @return varchar(4000)

      set @return = @i

      while patindex('%[^a-zA-Z0-9]%', @return) != 0


        select @return=replace(@return, substring(@return, patindex('%[^a-zA-Z0-9]%', @return), 1), '')


      return @return



    select dbo.fn_cleanup(i_sku) from sqlsale_merch_1

    where tran_date >= '2002-03-07' and tran_date <= '2002-03-15'


  • Just wondering...

    Instead of deleting those "undesirable" characters, could we replace with a character of our choice? 

    Hum.... how about replacing each undesirable character with another one (if defined) otherwise replaced with a common character ('')


  • Here's a snippet of a function I wrote to strip out any non-alpha characters from a name, such as apostrophes and hyphens.  You can add a third criterion to the If statement to include digits.  The loop copies each character to the new string if it's a letter, and ignores it and goes on to the next one if it's not alpha, using the ASCII character set.

    WHILE LEN(@ReturnText) <= LEN(@Lastname)


       SELECT @Position = @Position + 1 --move one position to the right

       IF SUBSTRING(@Lastname, @Position, 1) BETWEEN CHAR(65) AND CHAR(90) OR --upper

           SUBSTRING(@Lastname, @Position, 1) BETWEEN CHAR(97) AND CHAR(122)   --lower 


                  SELECT @ReturnText = @ReturnText + SUBSTRING(@Lastname, @Position, 1)


              IF @Position = LEN(@Lastname) BREAK ELSE CONTINUE


    Connecticut, USA

  • Okay I had the same need except I needed it to strip out leading zeroes and add digits.  I think the last one posted is the most flexible. 

    But there's one thing I'm still unsure of.   I'd like to have it return the final string with an override of the initial output specification (in this case varchar(4000) ) returning instead the length of fixed string.  Does anyone know if there's a way a function can do it 'dynamically'?  Right now I extract the return in the select statement with something like LEFT( dbo.Strip_Bad_chrs( @field) , LEN( @field) ).

    Anyway here's my latest code.

    CREATE FUNCTION Strip_Bad_Chrs(@Expression VARCHAR(4000) )

    RETURNS VARCHAR(4000)  -- Can this be made like VARCHAR(LEN(@Expression ))?



     DECLARE @ReturnText VARCHAR(4000), @Position INT

     SET @Position=0

     SET @ReturnText=''

     WHILE LEN(@ReturnText) <= LEN(@Expression)


        SELECT @Position = @Position + 1 --move one position to the right

        IF SUBSTRING(@Expression, @Position, 1) BETWEEN CHAR(65) AND CHAR(90) OR --upper

             SUBSTRING(@Expression, @Position, 1) BETWEEN CHAR(97) AND CHAR(122) OR   --lower

             (SUBSTRING(@Expression, @Position, 1) BETWEEN CHAR(48) AND CHAR(57) AND @position >1 ) OR -- include digit

             (SUBSTRING(@Expression, @Position, 1) BETWEEN CHAR(49) AND CHAR(57) AND @position =1 ) -- accept 1-9 digits for leading character (zeros stripped)


                    SELECT @ReturnText = @ReturnText + SUBSTRING(@Expression, @Position, 1)


               IF @Position = LEN(@Expression) BREAK ELSE CONTINUE


     RETURN LEFT( @ReturnText, LEN(RTRIM( @returnText)) )  --< trick doesn't work


    What do you think?


Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply