Using REPLACE with a string where I need to replace %anystring% with some new string

  • Hi everyone,

    I am actually fairly new to using the REPLACE function and don't know if I can use regex with it. The string that I want to replace is always between 2 % characters and includes the % characters.

    Sample data:

    SELECT 'somestring+%sometext%' AS string

    INTO #temp

    UNION ALL

    SELECT '%sometext%'

    UNION ALL

    SELECT 'somestring+%sometext%+somesecondstring'

    So I would like the results when selected from the tale to look like this (no update needed, this is just a query):

    SELECT 'somestring+replaced' AS newString

    UNION ALL

    SELECT 'replaced'

    UNION ALL

    SELECT 'somestring+replaced+somesecondstring'

    So, can I use the REPLACE function to find the '%whatever%' string?

    Jared
    CE - Microsoft

  • I could be wrong, but I'm pretty sure Replace() won't take regex or other patterns. I'm pretty sure it only deals with strings or string-type variables with pre-defined values.

    However, solving this with Stuff() would be quite easy.

    SELECT string,

    CHARINDEX('%', string, 1),

    CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1),

    STUFF(string, CHARINDEX('%', string, 1),

    CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1, 'replaced')

    FROM (SELECT 'somestring+%sometext%' AS string

    UNION ALL

    SELECT '%sometext%'

    UNION ALL

    SELECT 'somestring+%sometext%+somesecondstring') AS Example;

    The first three columns are just the internal pieces of the logic. The final column is what you need for this kind of thing.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (8/13/2012)


    I could be wrong, but I'm pretty sure Replace() won't take regex or other patterns. I'm pretty sure it only deals with strings or string-type variables with pre-defined values.

    However, solving this with Stuff() would be quite easy.

    SELECT string,

    CHARINDEX('%', string, 1),

    CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1),

    STUFF(string, CHARINDEX('%', string, 1),

    CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1, 'replaced')

    FROM (SELECT 'somestring+%sometext%' AS string

    UNION ALL

    SELECT '%sometext%'

    UNION ALL

    SELECT 'somestring+%sometext%+somesecondstring') AS Example;

    The first three columns are just the internal pieces of the logic. The final column is what you need for this kind of thing.

    Interesting... Thanks G. I think you are right that REPLACE() does not accept patterns. That's why my RegEx wasn't working 🙂

    Jared
    CE - Microsoft

  • Does this not work for you?

    select REPLACE(yourstring '%sometext%', 'replaced')

    from #temp

  • Alternatively, if you really need to use Replace instead of Stuff for this, try using Substring to get the piece that you want to replace. Build the Substring parameters the same way I did the Stuff position and length. Then use that as the first argument for the Replace function.

    SELECT string,

    SUBSTRING(string, CHARINDEX('%', string, 1),

    CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1),

    REPLACE(string,

    SUBSTRING(string, CHARINDEX('%', string, 1),

    CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1),

    'replaced')

    FROM (SELECT 'somestring+%sometext%' AS string

    UNION ALL

    SELECT '%sometext%'

    UNION ALL

    SELECT 'somestring+%sometext%+somesecondstring') AS Example;

    Either will work. Just thought of the nested Replace(Substring()) version second is all.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • @rory: Sorry I was not clear that sometext could be anything; i.e. %sometext%, %hogwash%, '%palinka%'... So your proposed solution will not work.

    @g: No need to use replace, I think your STUFF solution will work just fine. Thanks again!

    Jared
    CE - Microsoft

  • roryp 96873 (8/13/2012)


    Does this not work for you?

    select REPLACE(yourstring '%sometext%', 'replaced')

    from #temp

    I think the point is to replace any text between %.

    --Vadim R.

  • But based on the desired output he provided it looked like he wanted the % signs removed as well. Or did I misread that?

    Edit: Nevermind, I get it now. You won't know what the string is in advance, you are just looking for % and want to replace what is in there. Sorry for the confusion.

  • roryp 96873 (8/13/2012)


    But based on the desired output he provided it looked like he wanted the % signs removed as well. Or did I misread that?

    Edit: Nevermind, I get it now. You won't know what the string is in advance, you are just looking for % and want to replace what is in there. Sorry for the confusion.

    Yes, but your were right first with the %'s themselves; i.e. they should also be replaced.

    Jared
    CE - Microsoft

  • Playing around with PATINDEX, which is the closest thing you'll find to REGEX (except the CLR REGEX functions of course), I come up with something like this:

    ;WITH SampleData (string) AS (

    SELECT 'somestring+%sometext%'

    UNION ALL

    SELECT '%sometext%'

    UNION ALL

    SELECT CAST('somestring+%sometext%+somesecondstring' AS VARCHAR(100))

    )

    SELECT string, StringReplaced=STUFF(string, [First], 1+[Second], 'replaced')

    FROM SampleData

    CROSS APPLY (SELECT [First]=PATINDEX('%[%]%', string)) a

    CROSS APPLY (SELECT [Second]=PATINDEX('%[%]%', SUBSTRING(string, 1+[First], LEN(string)))) b

    Not sure if this helps though.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (8/14/2012)


    Playing around with PATINDEX, which is the closest thing you'll find to REGEX (except the CLR REGEX functions of course), I come up with something like this:

    ;WITH SampleData (string) AS (

    SELECT 'somestring+%sometext%'

    UNION ALL

    SELECT '%sometext%'

    UNION ALL

    SELECT CAST('somestring+%sometext%+somesecondstring' AS VARCHAR(100))

    )

    SELECT string, StringReplaced=STUFF(string, [First], 1+[Second], 'replaced')

    FROM SampleData

    CROSS APPLY (SELECT [First]=PATINDEX('%[%]%', string)) a

    CROSS APPLY (SELECT [Second]=PATINDEX('%[%]%', SUBSTRING(string, 1+[First], LEN(string)))) b

    Not sure if this helps though.

    Nice twist Dwain! I'll take a look at performance and execution plans between the both of these when I into work today.

    Jared
    CE - Microsoft

  • So I set up a quick test using this setup:

    USE test

    GO

    CREATE TABLE testStringReplace (string VARCHAR(250))

    INSERT INTO testStringReplace

    SELECT TOP 500000 c1.name + '%' + c1.name + '%'

    FROM sys.columns c1

    CROSS JOIN sys.columns c2

    GO

    --SET STATISTICS IO ON

    --SET STATISTICS TIME ON

    SELECT string, StringReplaced=STUFF(string, [First], 1+[Second], 'replaced')

    FROM testStringReplace

    CROSS APPLY (SELECT [First]=PATINDEX('%[%]%', string)) a

    CROSS APPLY (SELECT [Second]=PATINDEX('%[%]%', SUBSTRING(string, 1+[First], LEN(string)))) b

    SELECT string,

    REPLACE(string,

    SUBSTRING(string, CHARINDEX('%', string, 1),

    CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1),

    'replaced')

    FROM testStringReplace

    SELECT string,

    STUFF(string, CHARINDEX('%', string, 1),

    CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1, 'replaced')

    FROM testStringReplace

    GO

    The actual execution plans are exactly the same for each and IO is the same. Interestingly enough... Every time I run this, the third one performs slightly better on time. I saw the same result when I did a second CROSS JOIN for the data load and took TOP 1000000. Now, we are only talking a couple hundred ms, but it was consistent. Thoughts?

    Jared
    CE - Microsoft

  • If you are looking for performance, then "well-done" CLR using RegEx will outperform T-SQL for large and/or complex string replacements.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • SQLKnowItAll (8/14/2012)


    So I set up a quick test using this setup:

    USE test

    GO

    CREATE TABLE testStringReplace (string VARCHAR(250))

    INSERT INTO testStringReplace

    SELECT TOP 500000 c1.name + '%' + c1.name + '%'

    FROM sys.columns c1

    CROSS JOIN sys.columns c2

    GO

    --SET STATISTICS IO ON

    --SET STATISTICS TIME ON

    SELECT string, StringReplaced=STUFF(string, [First], 1+[Second], 'replaced')

    FROM testStringReplace

    CROSS APPLY (SELECT [First]=PATINDEX('%[%]%', string)) a

    CROSS APPLY (SELECT [Second]=PATINDEX('%[%]%', SUBSTRING(string, 1+[First], LEN(string)))) b

    SELECT string,

    REPLACE(string,

    SUBSTRING(string, CHARINDEX('%', string, 1),

    CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1),

    'replaced')

    FROM testStringReplace

    SELECT string,

    STUFF(string, CHARINDEX('%', string, 1),

    CHARINDEX('%', string, CHARINDEX('%', string, 1) + 1) - CHARINDEX('%', string, 1) + 1, 'replaced')

    FROM testStringReplace

    GO

    The actual execution plans are exactly the same for each and IO is the same. Interestingly enough... Every time I run this, the third one performs slightly better on time. I saw the same result when I did a second CROSS JOIN for the data load and took TOP 1000000. Now, we are only talking a couple hundred ms, but it was consistent. Thoughts?

    I confess, I didn't really expect the PATINDEX version to outperform the others. Just thought I'd throw it out there in case it ended up more applicable to what you were doing. You might be able to squeak a little bit extra performance out of it by eliminating the cascading CROSS APPLYs. Meaning, embed the intermediate result calculations directly where they're needed. I wrote it that way for readability.

    Eugene's suggestion of using a CLR REGEX function may be faster (haven't tried) but here's a link: http://www.simple-talk.com/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Eugene Elutin (8/14/2012)


    If you are looking for performance, then "well-done" CLR using RegEx will outperform T-SQL for large and/or complex string replacements.

    I've found that's frequently not true even when "well-done".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply