Relplace all but a-zA-Z0-9

  • Hello,

    I had a relitivily sadistic query as seen in the first example to remove any invalid characters that made it into our SKU's.  This refers to any character not between A-Z and 0-9.

    update sqlsale_merch_1

    set I_SKU = REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(I_SKU,'=',''),'~',''),'?',''),'|',''),'<',''),'>',''),'+',''),',',''),'[',''),']',''),';',''),':',''),'-',''),'"',''),'\',''),'/',''),'.',''),'''',''),'`','')

    where tran_date >= '2004-03-07' and tran_date <= '2004-03-14'

    and (i_sku LIKE '%`%' or i_sku LIKE '%''%' or i_sku LIKE '%.%' or i_sku LIKE '%/%' or i_sku LIKE '%\%' or i_sku LIKE '%"%' or i_sku LIKE '%-%'or i_sku LIKE '%:%' or i_sku LIKE '%;%' or i_sku LIKE '%]%' or i_sku LIKE '%[%' or i_sku LIKE '%,%' or i_sku LIKE '%+%' or i_sku LIKE '%<%' or i_sku LIKE '%>%' or i_sku LIKE '%|%' or i_sku LIKE '%~%' or i_sku LIKE '%?%' or i_sku LIKE '%=%')

    I am sure eveyone looking at that has to be laughing harder than they have ever done before, but at the time I was in a big rush and I knew that would work.  Yesterday we got a few more foreign characters that I was not taking into account so I was forced to find the "Correct" way to do this, because I can not come in every monday mourning and discover that a new character needs to be accounted for.  I found the following script, which works ok, for the most part.

    select replace(i_sku, substring(i_sku, patindex('%[^a-zA-Z0-9]%', i_sku), 1), '')

     from sqlsale_merch_1

    where tran_date >= '2002-03-07' and tran_date <= '2002-03-15'

    group by i_sku order by i_sku

    The issue is that it still contains +, Ü, and a few other accented charaters. 

    Can anyone point me to a true A-Z and 0-9 only solution.

    Thanks,

    Patrick


    Patrick L. Lykins

  • create function fn_cleanup (@i varchar(4000))

    returns varchar(4000)

    as

    begin

      declare @return varchar(4000)

      set @return = @i

      while patindex('%[^a-zA-Z0-9]%', @return) != 0

      begin

        select @return=replace(@return, substring(@return, patindex('%[^a-zA-Z0-9]%', @return), 1), '')

      end

      return @return

    end

    go

    select dbo.fn_cleanup(i_sku) from sqlsale_merch_1

    where tran_date >= '2002-03-07' and tran_date <= '2002-03-15'

     

  • Just wondering...

    Instead of deleting those "undesirable" characters, could we replace with a character of our choice? 

    Hum.... how about replacing each undesirable character with another one (if defined) otherwise replaced with a common character ('')

     

  • Here's a snippet of a function I wrote to strip out any non-alpha characters from a name, such as apostrophes and hyphens.  You can add a third criterion to the If statement to include digits.  The loop copies each character to the new string if it's a letter, and ignores it and goes on to the next one if it's not alpha, using the ASCII character set.

    WHILE LEN(@ReturnText) <= LEN(@Lastname)

    BEGIN

       SELECT @Position = @Position + 1 --move one position to the right

       IF SUBSTRING(@Lastname, @Position, 1) BETWEEN CHAR(65) AND CHAR(90) OR --upper

           SUBSTRING(@Lastname, @Position, 1) BETWEEN CHAR(97) AND CHAR(122)   --lower 

              BEGIN

                  SELECT @ReturnText = @ReturnText + SUBSTRING(@Lastname, @Position, 1)

              END

              IF @Position = LEN(@Lastname) BREAK ELSE CONTINUE

    END



    Dana
    Connecticut, USA
    Dana

  • Okay I had the same need except I needed it to strip out leading zeroes and add digits.  I think the last one posted is the most flexible. 

    But there's one thing I'm still unsure of.   I'd like to have it return the final string with an override of the initial output specification (in this case varchar(4000) ) returning instead the length of fixed string.  Does anyone know if there's a way a function can do it 'dynamically'?  Right now I extract the return in the select statement with something like LEFT( dbo.Strip_Bad_chrs( @field) , LEN( @field) ).

    Anyway here's my latest code.

    CREATE FUNCTION Strip_Bad_Chrs(@Expression VARCHAR(4000) )

    RETURNS VARCHAR(4000)  -- Can this be made like VARCHAR(LEN(@Expression ))?

    AS

    BEGIN

     DECLARE @ReturnText VARCHAR(4000), @Position INT

     SET @Position=0

     SET @ReturnText=''

     WHILE LEN(@ReturnText) <= LEN(@Expression)

     BEGIN

        SELECT @Position = @Position + 1 --move one position to the right

        IF SUBSTRING(@Expression, @Position, 1) BETWEEN CHAR(65) AND CHAR(90) OR --upper

             SUBSTRING(@Expression, @Position, 1) BETWEEN CHAR(97) AND CHAR(122) OR   --lower

             (SUBSTRING(@Expression, @Position, 1) BETWEEN CHAR(48) AND CHAR(57) AND @position >1 ) OR -- include digit

             (SUBSTRING(@Expression, @Position, 1) BETWEEN CHAR(49) AND CHAR(57) AND @position =1 ) -- accept 1-9 digits for leading character (zeros stripped)

                BEGIN                  

                    SELECT @ReturnText = @ReturnText + SUBSTRING(@Expression, @Position, 1)

               END

               IF @Position = LEN(@Expression) BREAK ELSE CONTINUE

               END

     RETURN LEFT( @ReturnText, LEN(RTRIM( @returnText)) )  --< trick doesn't work

    END

    What do you think?

     

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply