Best Practice for character replacement within Table.Column

  • Is there a better/more efficient way to do what the below does? It is  replacing several characters that are known to be problematic for this table/column(s) with a space.

    BONUS Q: IN the below example every character being replaced is being replaced with a space. What if I wanted to use a space for all but the & and = characters in which case I want to change the & to a PLUS and the = to and UNDERSCORE? Would the same solution work?

      UPDATE MM2WO 
    SET sDesc = Replace(Replace(Replace(Replace(Replace(Replace(sDesc,'&',' '),'<' , ' '),'>',' '),'!',' '), '-', ' '),'=',' ')
    WHERE 1 = 1
    AND( sDesc LIKE'%&%'
    OR sDecs LIKE'%<%'
    OR sDesc LIKE'%>%'
    OR sDesc LIKE'%!%'
    OR sDesc LIKE'%-%'
    OR sDesc LIKE'%=%'
    )

     

    CREATE TABLE [dbo].[MM2WO](
    [hMy] [NUMERIC](21, 0) IDENTITY(1,1) NOT NULL,
    [sDesc] [CHAR](35) NULL,
    CONSTRAINT [pk_mm2wo] PRIMARY KEY NONCLUSTERED

     

    • This topic was modified 10 months ago by  YSLGuru.

    Kindest Regards,

    Just say No to Facebook!
  • You don't appear to have included the solution you wanted us to review.

  • That is what the first code segment is, the current solution. The second if for the tables structure, the DDL

    • This reply was modified 10 months ago by  YSLGuru.

    Kindest Regards,

    Just say No to Facebook!
  • I couldn't see the code until I quoted the post for reply, then my post was also hard to see.  I pasted it in regular text as I couldnt make it visible.

     

      UPDATE MM2WO 

    SET sDesc = Replace(

    Replace(Replace(Replace(Replace(Replace(sDesc,'&' ,' '), '<' , ' ') ,'>' ,' ') ,'!' ,' '), '-', ' '),'=',' ')

    WHERE 1 = 1

    AND( sDesc LIKE'%&%'
    OR sDecs LIKE'%<%'
    OR sDesc LIKE'%>%'
    OR sDesc LIKE'%!%'
    OR sDesc LIKE'%-%'
    OR sDesc LIKE'%=%'
    )

    UPDATE MM2WO

    SET sDesc = Replace(

    Replace(Replace(Replace(Replace(Replace(sDesc,'&' ,' '), '<' , ' ') ,'>' ,' ') ,'!' ,' '), '-', ' '),'=',' ')

    WHERE 1 = 1

    AND( sDesc LIKE'%&%'

    OR sDecs LIKE'%<%'

    OR sDesc LIKE'%>%'

    OR sDesc LIKE'%!%'

    OR sDesc LIKE'%-%'

    OR sDesc LIKE'%=%'

    )

     

    • This reply was modified 10 months ago by  Ed B.
    • This reply was modified 10 months ago by  Ed B.
  • Gotcha. Hopefully someone like Jeff will have some clever way to improve on this as this is a regular fix I'm looking at having to do daily until our vendor can fix the issue that's inserting these problematic characters where they shouldn't be. I feel certain that some kind of temp table maybe combined with a Windows function can simplify this.

     

     

    Thanks

    Kindest Regards,

    Just say No to Facebook!
  • YSLGuru wrote:

    Is there a better/more efficient way to do what the below does? It is  replacing several characters that are known to be problematic for this table/column(s) with a space.

    BONUS Q: IN the below example every character being replaced is being replaced with a space. What if I wanted to use a space for all but the & and = characters in which case I want to change the & to a PLUS and the = to and UNDERSCORE? Would the same solution work?

     

     
    UPDATE MM2WO
    SET sDesc = Replace(Replace(Replace(Replace(Replace(Replace(sDesc,'&',' '),'<' , ' '),'>',' '),'!',' '), '-', ' '),'=',' ')
    WHERE 1 = 1
    AND( sDesc LIKE'%&%'
    OR sDecs LIKE'%<%'
    OR sDesc LIKE'%>%'
    OR sDesc LIKE'%!%'
    OR sDesc LIKE'%-%'
    OR sDesc LIKE'%=%'
    )

     

    CREATE TABLE [dbo].[MM2WO](
    [hMy] [NUMERIC](21, 0) IDENTITY(1,1) NOT NULL,
    [sDesc] [CHAR](35) NULL,
    CONSTRAINT [pk_mm2wo] PRIMARY KEY NONCLUSTERED

    I quoted your entire original post to see if the "goodies" that didn't appear in the first post would show up.  They did in my edit window and now I'm looking to see if the show up in the post that quotes them.  I really wish they'd fix the editor code and a bunch of other things that have been broken ever since the switch to Word Press years ago. 🙁

    EDIT... fascinating mess being made by the forum software on the first bit of code.  Apparently, it has some combination of characters in the code that's blowing the mind of the code window.

    Lets see if just plain text posted works...

    UPDATE MM2WO

    SET sDesc = Replace(Replace(Replace(Replace(Replace(Replace(sDesc,'&',' '),'<' , ' '),'>',' '),'!',' '), '-', ' '),'=',' ')

    WHERE 1 = 1

    AND( sDesc LIKE'%&%'

    OR sDecs LIKE'%<%'

    OR sDesc LIKE'%>%'

    OR sDesc LIKE'%!%'

    OR sDesc LIKE'%-%'

    OR sDesc LIKE'%=%'

    )

    OK 😀  That worked.  Marching on.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • All-righty then...

    You say that you want all of the special characters to be replaced by spaces except for "&" to be replaced by "+" and "=" to be replaced by a "_".

    My question would be, what's stopping you from doing that in your REPLACEs?  The answer is ... Nothing.

    UPDATE MM2WO

    SET sDesc = Replace(Replace(Replace(Replace(Replace(Replace(sDesc,'&','+'),'<' , ' '),'>',' '),'!',' '), '-', ' '),'=','_')

    WHERE 1 = 1

    AND( sDesc LIKE'%&%'

    OR sDecs LIKE'%<%'

    OR sDesc LIKE'%>%'

    OR sDesc LIKE'%!%'

    OR sDesc LIKE'%-%'

    OR sDesc LIKE'%=%'

    );

    You might be able to short-circuit the WHERE clause a bit.

    UPDATE MM2WO

    SET sDesc = Replace(Replace(Replace(Replace(Replace(Replace(sDesc,'&','+'),'<' , ' '),'>',' '),'!',' '), '-', ' '),'=','_')

    WHERE 1 = 1

    AND  sDesc LIKE'%[-&<>!=]%' ;

    Depending on your default collation, you might be able to seriously increase your speed by adding a binary collation to the occurrences of the sDesc in the formula and the AND.

    UPDATE MM2WO

    SET sDesc = Replace(Replace(Replace(Replace(Replace(Replace(sDesc COLLATE Latin1_General_BIN2,'&','+'),'<' , ' '),'>',' '),'!',' '), '-', ' '),'=','_')

    WHERE 1 = 1

    AND  sDesc COLLATE Latin1_General_BIN2 LIKE'%[-&<>!=]%' ;

    Just to demonstrate that it works...

    DECLARE @sDesc VARCHAR(100) = 'X&X<X>X!X-X=';
    SELECT Replace(Replace(Replace(Replace(Replace(Replace(@sDesc COLLATE Latin1_General_BIN2,'&','+'),'<' , ' '),'>',' '),'!',' '), '-', ' '),'=','_')
    WHERE 1 = 1
    AND  @sDesc COLLATE Latin1_General_BIN2 LIKE'%[-&<>!=]%';

    Result:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff - that's what I am doing now, just using Replace(). I wanted to know if there was a better way to do it and so far it looks like there's not a better way so much as another way. This particular update could take awhile to run due to the number of possible records it may find so I was looking for any alternatives that would provide a significant boost in performance but it's not important that we find another way.

     

    Thanks to all

    Kindest Regards,

    Just say No to Facebook!
  • Starting with SQL 2017, you can use the TRANSLATE function

  • YSLGuru wrote:

    Jeff - that's what I am doing now, just using Replace(). I wanted to know if there was a better way to do it and so far it looks like there's not a better way so much as another way. This particular update could take awhile to run due to the number of possible records it may find so I was looking for any alternatives that would provide a significant boost in performance but it's not important that we find another way.

    Thanks to all

    Jeff also suggested a refinement to the used of REPLACE() which might speed things up. Did you try it?

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Unfortunately our cloud provider still has us on SQL 2016.

     

    Thanks

    Kindest Regards,

    Just say No to Facebook!
  • YSLGuru wrote:

    Jeff - that's what I am doing now, just using Replace(). I wanted to know if there was a better way to do it and so far it looks like there's not a better way so much as another way. This particular update could take awhile to run due to the number of possible records it may find so I was looking for any alternatives that would provide a significant boost in performance but it's not important that we find another way.

    Thanks to all

    Heh... "Always look eye". 😛  Using nested REPLACEs with a BINARY COLLATION is going to be really hard to beat.

    Here's the test code...

    --===== Create and populate the test table on-the-fly
    DROP TABLE IF EXISTS #MyHead;
    SELECT TOP 1000000
    Original = CONVERT(VARCHAR(36),NEWID())
    INTO #MyHead
    FROM sys.all_columns ac1
    CROSS JOIN sys.all_columns ac2
    ;
    GO
    PRINT '--===== Prime the pumps" for the tests ===========';
    SET STATISTICS TIME ON;
    DECLARE @Result VARCHAR(36);
    SELECT @Result = Original
    FROM #MyHead;
    SET STATISTICS TIME OFF;
    GO
    PRINT '--===== Use TRANSLATE ============================';
    DECLARE @Result VARCHAR(36);
    SET STATISTICS TIME ON;
    SELECT @Result = TRANSLATE(Original,'AE-',' &')
    FROM #MyHead;
    SET STATISTICS TIME OFF;
    GO
    PRINT '--===== Use REPLACE w/Binary Collation ===========';
    DECLARE @Result VARCHAR(36);
    SET STATISTICS TIME ON;
    SELECT @Result = REPLACE(REPLACE(REPLACE(Original COLLATE Latin1_General_BIN2,'A',' '),'E',' '),'-','&')
    FROM #MyHead;
    SET STATISTICS TIME OFF;
    GO

    Here are the results...

    (1000000 rows affected)
    --===== Prime the pumps" for the tests ===========

    SQL Server Execution Times:
    CPU time = 157 ms, elapsed time = 143 ms.
    --===== Use TRANSLATE ============================

    SQL Server Execution Times:
    CPU time = 2343 ms, elapsed time = 2336 ms.
    --===== Use REPLACE w/Binary Collation ===========

    SQL Server Execution Times:
    CPU time = 1125 ms, elapsed time = 1160 ms.

    IMPORTANT OTHER CONSIDERATIONS FOR THIS PROBLEM:

    First, remember that binary collations ARE CASE/ACCENT SENSITIVE.

    If you're using nested REPLACEs with the Binary Collation and the code still runs slow, it's NOT because of the REPLACEs... there's something wrong with the rest of your code.

    Personally, I wouldn't do an UPDATE, which is going to beat the hell out of your logfile and make things slower, especially if you have indexes that will be affected.  Instead, I'd turn this into a PERSISTED Computed column.

    Either that or do the REPLACEs on the data as you add the rows to the table in the future.

    Sidebar:

    As a bit of a sidebar, this is more proof that "Change is inevitable, change for the better is not".  MS probably used some form of "Regex Replace" behind the scenes for the new TRANSLATE function and that makes it slow.  It's likely the same issue they have with the FORMAT function.

    https://www.sqlservercentral.com/articles/how-to-format-dates-in-sql-server-hint-dont-use-format

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • UPDATE MM2WO
    SET sDesc = TRANSLATE(sDesc, '&<!>!-=', ' ')
    WHERE ...

    https://learn.microsoft.com/en-us/sql/t-sql/functions/translate-transact-sql?view=sql-server-ver16

  • Jonathan AC Roberts wrote:

    UPDATE MM2WO
    SET sDesc = TRANSLATE(sDesc, '&<!>!-=', ' ')
    WHERE ...

    https://learn.microsoft.com/en-us/sql/t-sql/functions/translate-transact-sql?view=sql-server-ver16

    TRANSLATE not available in 2016.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin wrote:

    Jonathan AC Roberts wrote:

    UPDATE MM2WO
    SET sDesc = TRANSLATE(sDesc, '&<!>!-=', ' ')
    WHERE ...

    https://learn.microsoft.com/en-us/sql/t-sql/functions/translate-transact-sql?view=sql-server-ver16

    TRANSLATE not available in 2016.

    Oh yes, I was looking at the "ver16" in the help URL.

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply