regarding replace function

Question

regarding replace function

mpradeep23

SSCommitted

Points: 1868
More actions
February 13, 2014 at 9:48 am

#289533

hi all
i have column in table which contains tabs and " i want replace with space
i am using repalce function is thier other way to improve performance with out using replace function
thanks,
pradeep

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

Luis Cazares SSC Guru Points: 183698 More actions · Answer 1

Why do you believe that replace is causing performance problems?

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

mpradeep23 SSCommitted Points: 1868 More actions · Answer 2

hi

i am thinking other then replace which can be to improve queery performance

thanks,

pradeep

SQLRNNR SSC Guru Points: 281334 More actions · Answer 3

with just a couple of characters replace works quite nicely.

Another option that people use from time to time is a function that loops through the string to replace characters. That can work quickly and can be very slow.

Another option is to build a CLR regex to replace those characters.

Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 4

The biggest problem I have seen at clients along this type of need is not having a WHERE clause so you only update rows that actually HAVE at least one tab character in them. Updating 7.3M rows of fat data when only 937 of them actually have one or more tabs is a MASSIVE waste of many types of resources.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

sqldriver SSChampion Points: 10440 More actions · Answer 5

TheSQLGuru (2/13/2014)
The biggest problem I have seen at clients along this type of need is not having a WHERE clause so you only update rows that actually HAVE at least one tab character in them. Updating 7.3M rows of fat data when only 937 of them actually have one or more tabs is a MASSIVE waste of many types of resources.

Is filtering in the WHERE faster than filtering with a CASE statement? Most of the time when this has to be done there's more than one wonky character to replace, so it's either multiple CASE or WHERE with multiple ORs. Do they scale much differently?

Thanks!

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 6

sqldriver (2/14/2014)
TheSQLGuru (2/13/2014)
The biggest problem I have seen at clients along this type of need is not having a WHERE clause so you only update rows that actually HAVE at least one tab character in them. Updating 7.3M rows of fat data when only 937 of them actually have one or more tabs is a MASSIVE waste of many types of resources.
Is filtering in the WHERE faster than filtering with a CASE statement? Most of the time when this has to be done there's more than one wonky character to replace, so it's either multiple CASE or WHERE with multiple ORs. Do they scale much differently?
Thanks!

I think we may be talking apples and pomegranates here, or maybe this head cold is worse than I am thinking it is! 🙂

Please give me example code to demonstrate each scenario of whatever it is you are trying to say above.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

sqldriver SSChampion Points: 10440 More actions · Answer 7

TheSQLGuru (2/14/2014)
sqldriver (2/14/2014)
TheSQLGuru (2/13/2014)
The biggest problem I have seen at clients along this type of need is not having a WHERE clause so you only update rows that actually HAVE at least one tab character in them. Updating 7.3M rows of fat data when only 937 of them actually have one or more tabs is a MASSIVE waste of many types of resources.
Is filtering in the WHERE faster than filtering with a CASE statement? Most of the time when this has to be done there's more than one wonky character to replace, so it's either multiple CASE or WHERE with multiple ORs. Do they scale much differently?
Thanks!
I think we may be talking apples and pomegranates here, or maybe this head cold is worse than I am thinking it is! 🙂
Please give me example code to demonstrate each scenario of whatever it is you are trying to say above.

Sure, this is what I was thinking, simplified a bit. In one you're filtering in the WHERE, in another you're using CASE to filter what's getting replaced. Does that make sense? I know it makes a difference in what gets displayed when you use SELECT, but updating might be different?

;WITH FeelBetterKevin (product) AS (

SELECT 'NyQuil' UNION ALL

SELECT 'Sudafed' UNION ALL

SELECT 'Mucinex D' UNION ALL

SELECT 'Afrin' UNION ALL

SELECT 'DayQuil' UNION ALL

SELECT 'Aleve D' UNION ALL

SELECT 'Whiskey'

)

SELECT replace(product, 'Q','QQ')

FROM feelbetterkevin

WHERE product like '%Q%'

;WITH FeelBetterKevin (product) AS (

SELECT 'NyQuil' UNION ALL

SELECT 'Sudafed' UNION ALL

SELECT 'Mucinex D' UNION ALL

SELECT 'Afrin' UNION ALL

SELECT 'DayQuil' UNION ALL

SELECT 'Aleve D' UNION ALL

SELECT 'Whiskey'

)

SELECT CASE WHEN product LIKE '%Q%' THEN replace(product, 'Q','QQ') ELSE product END AS product

FROM feelbetterkevin

And, in regards to scaling, I meant that if you additionally wanted to replace X with XX, it would take another CASE statement, or a WHERE ... OR... to filter. I seem to recall a Grant Rant about WHERE ... OR ... making the optimizer not use indexes properly.

I understand that using '%Q%' will already defeat any indexing, being all non-SARGable and whatnot.

Thanks!

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 8

sqldriver (2/14/2014)
TheSQLGuru (2/14/2014)
sqldriver (2/14/2014)
TheSQLGuru (2/13/2014)
The biggest problem I have seen at clients along this type of need is not having a WHERE clause so you only update rows that actually HAVE at least one tab character in them. Updating 7.3M rows of fat data when only 937 of them actually have one or more tabs is a MASSIVE waste of many types of resources.
Is filtering in the WHERE faster than filtering with a CASE statement? Most of the time when this has to be done there's more than one wonky character to replace, so it's either multiple CASE or WHERE with multiple ORs. Do they scale much differently?
Thanks!
I think we may be talking apples and pomegranates here, or maybe this head cold is worse than I am thinking it is! 🙂
Please give me example code to demonstrate each scenario of whatever it is you are trying to say above.
Sure, this is what I was thinking, simplified a bit. In one you're filtering in the WHERE, in another you're using CASE to filter what's getting replaced. Does that make sense? I know it makes a difference in what gets displayed when you use SELECT, but updating might be different?
;WITH FeelBetterKevin (product) AS (
SELECT 'NyQuil' UNION ALL
SELECT 'Sudafed' UNION ALL
SELECT 'Mucinex D' UNION ALL
SELECT 'Afrin' UNION ALL
SELECT 'DayQuil' UNION ALL
SELECT 'Aleve D' UNION ALL
SELECT 'Whiskey'
)
SELECT replace(product, 'Q','QQ')
FROM feelbetterkevin
WHERE product like '%Q%'
;WITH FeelBetterKevin (product) AS (
SELECT 'NyQuil' UNION ALL
SELECT 'Sudafed' UNION ALL
SELECT 'Mucinex D' UNION ALL
SELECT 'Afrin' UNION ALL
SELECT 'DayQuil' UNION ALL
SELECT 'Aleve D' UNION ALL
SELECT 'Whiskey'
)
SELECT CASE WHEN product LIKE '%Q%' THEN replace(product, 'Q','QQ') ELSE product END AS product
FROM feelbetterkevin
And, in regards to scaling, I meant that if you additionally wanted to replace X with XX, it would take another CASE statement, or a WHERE ... OR... to filter. I seem to recall a Grant Rant about WHERE ... OR ... making the optimizer not use indexes properly.
I understand that using '%Q%' will already defeat any indexing, being all non-SARGable and whatnot.
Thanks!

I LOVE your example - made me laugh outloud!! The issue with your sample code though is that you aren't actually doing an UPDATE like I understood your OP to be about. Here is your sample modified to do just that:

CREATE TABLE #temp1 (med varchar(20) NOT NULL)

INSERT #temp1 (med)

SELECT 'NyQuil' UNION ALL

SELECT 'Sudafed' UNION ALL

SELECT 'Mucinex D' UNION ALL

SELECT 'Afrin' UNION ALL

SELECT 'DayQuil' UNION ALL

SELECT 'Aleve D' UNION ALL

SELECT 'Whiskey'

(7 row(s) affected)

UPDATE #temp1

SET med = replace(med, 'Q','QQ')

WHERE med like '%Q%'

(2 row(s) affected) <-- THIS IS GOOD!!

SELECT * FROM #temp1

---------------

CREATE TABLE #temp2 (med varchar(20) NOT NULL)

INSERT #temp2 (med)

SELECT 'NyQuil' UNION ALL

SELECT 'Sudafed' UNION ALL

SELECT 'Mucinex D' UNION ALL

SELECT 'Afrin' UNION ALL

SELECT 'DayQuil' UNION ALL

SELECT 'Aleve D' UNION ALL

SELECT 'Whiskey'

(7 row(s) affected)

UPDATE #temp2

SET med = CASE WHEN med LIKE '%Q%' THEN replace(med, 'Q','QQ') ELSE med END

(7 row(s) affected) <--THIS IS BAD!!

SELECT * FROM #temp2

As you can see, the first example only UPDATES TWO ROWS because of the WHERE clause. The second one UPDATES ALL SEVEN ROWS because there is no WHERE clause, although it only MODIFIES THE DATA of two of those rows as you can see from the output. This is the exact situation I was describing, and I see it far too often. And when it is 1M or 100M rows it can REALLY hurt!!

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

sqldriver SSChampion Points: 10440 More actions · Answer 9

TheSQLGuru (2/15/2014)
sqldriver (2/14/2014)
TheSQLGuru (2/14/2014)
sqldriver (2/14/2014)
TheSQLGuru (2/13/2014)
The biggest problem I have seen at clients along this type of need is not having a WHERE clause so you only update rows that actually HAVE at least one tab character in them. Updating 7.3M rows of fat data when only 937 of them actually have one or more tabs is a MASSIVE waste of many types of resources.
Is filtering in the WHERE faster than filtering with a CASE statement? Most of the time when this has to be done there's more than one wonky character to replace, so it's either multiple CASE or WHERE with multiple ORs. Do they scale much differently?
Thanks!
I think we may be talking apples and pomegranates here, or maybe this head cold is worse than I am thinking it is! 🙂
Please give me example code to demonstrate each scenario of whatever it is you are trying to say above.
Sure, this is what I was thinking, simplified a bit. In one you're filtering in the WHERE, in another you're using CASE to filter what's getting replaced. Does that make sense? I know it makes a difference in what gets displayed when you use SELECT, but updating might be different?
;WITH FeelBetterKevin (product) AS (
SELECT 'NyQuil' UNION ALL
SELECT 'Sudafed' UNION ALL
SELECT 'Mucinex D' UNION ALL
SELECT 'Afrin' UNION ALL
SELECT 'DayQuil' UNION ALL
SELECT 'Aleve D' UNION ALL
SELECT 'Whiskey'
)
SELECT replace(product, 'Q','QQ')
FROM feelbetterkevin
WHERE product like '%Q%'
;WITH FeelBetterKevin (product) AS (
SELECT 'NyQuil' UNION ALL
SELECT 'Sudafed' UNION ALL
SELECT 'Mucinex D' UNION ALL
SELECT 'Afrin' UNION ALL
SELECT 'DayQuil' UNION ALL
SELECT 'Aleve D' UNION ALL
SELECT 'Whiskey'
)
SELECT CASE WHEN product LIKE '%Q%' THEN replace(product, 'Q','QQ') ELSE product END AS product
FROM feelbetterkevin
And, in regards to scaling, I meant that if you additionally wanted to replace X with XX, it would take another CASE statement, or a WHERE ... OR... to filter. I seem to recall a Grant Rant about WHERE ... OR ... making the optimizer not use indexes properly.
I understand that using '%Q%' will already defeat any indexing, being all non-SARGable and whatnot.
Thanks!
I LOVE your example - made me laugh outloud!! The issue with your sample code though is that you aren't actually doing an UPDATE like I understood your OP to be about. Here is your sample modified to do just that:
CREATE TABLE #temp1 (med varchar(20) NOT NULL)
INSERT #temp1 (med)
SELECT 'NyQuil' UNION ALL
SELECT 'Sudafed' UNION ALL
SELECT 'Mucinex D' UNION ALL
SELECT 'Afrin' UNION ALL
SELECT 'DayQuil' UNION ALL
SELECT 'Aleve D' UNION ALL
SELECT 'Whiskey'
(7 row(s) affected)
UPDATE #temp1
SET med = replace(med, 'Q','QQ')
WHERE med like '%Q%'
(2 row(s) affected) <-- THIS IS GOOD!!
SELECT * FROM #temp1
---------------
CREATE TABLE #temp2 (med varchar(20) NOT NULL)
INSERT #temp2 (med)
SELECT 'NyQuil' UNION ALL
SELECT 'Sudafed' UNION ALL
SELECT 'Mucinex D' UNION ALL
SELECT 'Afrin' UNION ALL
SELECT 'DayQuil' UNION ALL
SELECT 'Aleve D' UNION ALL
SELECT 'Whiskey'
(7 row(s) affected)
UPDATE #temp2
SET med = CASE WHEN med LIKE '%Q%' THEN replace(med, 'Q','QQ') ELSE med END
(7 row(s) affected) <--THIS IS BAD!!
SELECT * FROM #temp2
As you can see, the first example only UPDATES TWO ROWS because of the WHERE clause. The second one UPDATES ALL SEVEN ROWS because there is no WHERE clause, although it only MODIFIES THE DATA of two of those rows as you can see from the output. This is the exact situation I was describing, and I see it far too often. And when it is 1M or 100M rows it can REALLY hurt!!

Ah. I was thinking about the slide from your common T-SQL mistakes deck about predicate evaluation order (which I didn't have at work, where I probably need it most. heh!) and couldn't recall if this sort of thing was addressed. Thanks.

Dwain Camps SSC Guru Points: 86908 More actions · Answer 10

If you really need to eke every last erg of performance out of a replace, consider using a binary collation:

https://connect.microsoft.com/SQLServer/feedback/details/512459/replace-function-extremely-slow-with-non-binary-windows-collation

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 11

...erg...

Now THERE'S a word I haven't heard since I completed my BS in Physics!! Good one!:hehe:

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Dwain Camps SSC Guru Points: 86908 More actions · Answer 12

TheSQLGuru (2/16/2014)
...erg...
Now THERE'S a word I haven't heard since I completed my BS in Physics!! Good one!:hehe:

I've got a million of 'em.

THE SEX LIFE OF AN ELECTRON (with unhappy ending)

One night when his charge was at full capacity, Micro Farad decided to

get a cute little coil to discharge him. He picked up Millie Amp and

took her for a ride on his megacycle. They rode across the wheat stone

bridge, around the sine wave, and into the magnetic field next to the

flowing current.

Micro Farad, attracted by Millie's characteristic curve, soon had her

field fully excited. He laid her on the ground potential, raised her

frequency, lowered her resistance, and pulled out his high voltage

probe. He inserted it in parallel and began to short circuit her shunt.

Fully excited, Millie cried out, "ohm, ohm, give me mho". With his tube

at maximum output and her coil vibrating from the current flow, her

shunt soon reached maximum heat. The excessive current had shorted her

shunt, and Micro's capacity was rapidly discharged, and every electron

was drained off. They fluxed all night, tried various connections and

hookings until his bar magnet had lost all of its strength, and he could

no longer generate enough voltage to sustain his collapsing field. With

his battery fully discharged, Micro was unable to excite his tickler, so

they ended up reversing polarity and blowing each other's fuses.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 13

That's one of my favorites!! 😀

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service