# A quick query puzzle:

• Hi

I have a table with below data. Requirement is to replace all integers with continous 6 or more occurances with 'x'. Less than 6 occurances should not be replaced.

create table t1(name varchar (100))

GO

INsert into t1

GO

INsert into t1

select 'cbv736456XYZ543534534545XLS'

GO

EXPECTED RESULT:

cbvxxxxxxXYZxxxxxxxxxxxxXLS

drop table t1

Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

• Hi

You could try the following

`CREATE FUNCTION replace6plusint(@s1 varchar(100)) RETURNS TABLE AS RETURN`

`WITH`

`firstRun AS (`

`SELECT replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(@s1,'0','^'),'1','^'),'2','^'),'3','^'),'4','^'),'5','^'),'6','^'),'7','^'),'8','^'),'9','^'),'^^^^^^','------') a`

`UNION ALL`

`SELECT replace(a,'-^','--')`

`FROM firstRun`

`WHERE a like '%-^%'`

`),`

`secondRun AS (`

`SELECT replace(a,'-','x') b, charindex('^', a) p`

`FROM firstRun`

`WHERE a not like '%-^%'`

`UNION ALL`

`SELECT stuff(b, p, 1, substring(@s1,p,1)), charindex('^', b)`

`FROM secondRun`

`WHERE p <> 0`

`)`

`SELECT b result FROM secondRun WHERE p = 0`

• This is really good but problem is that my table has multiple rows and this is a table function. So it has to be a scalar function for me to use....

Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

• S_Kumar_S (10/5/2012)

This is really good but problem is that my table has multiple rows and this is a table function. So it has to be a scalar function for me to use....

Find some good reading material covering functions. You absolutely don't need a function for this - and most folks would recommend a table-valued function over the alternatives. It's far simpler than you think:

`SELECT name,`

`REPLACE(`

`REPLACE(`

`REPLACE(`

`REPLACE(`

`REPLACE(name,`

`'012345','xxxxxx'),`

`'123456','xxxxxx'),`

`'234567','xxxxxx'),`

`'345678','xxxxxx'),`

`'456789','xxxxxx')`

`FROM t1`

โWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ - Gail Shaw

Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

• Hi Chris.

the numbers are not consecutive as you have assumed. And they may be of length greater than 6. I think it is not so straight as it looks.

Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

• With a large amount of replaces this can be done.

Problem is that the number of replaces is very large.

So I'll show the concept but not the code, because the code would be large.

First of all I assum that there are 'strings' which do not occure in the supplied string.

Here I'll use the strings '^' and 'x', but if these occure replace then with strings which do not occure.

First replace alle the numbers in the strings with

1

• S_Kumar_S (10/5/2012)

Hi Chris.

the numbers are not consecutive as you have assumed. And they may be of length greater than 6. I think it is not so straight as it looks.

Ah, ok.

`SELECT`

`t.name, x.New_Name`

`FROM #t1 t`

`CROSS APPLY (`

`SELECT New_Name =`

`(SELECT Newletter + ''`

`FROM (`

`SELECT n, Newletter = CASE WHEN d.grouper IS NOT NULL`

`AND COUNT(*) OVER (PARTITION BY d.grouper) > 5 THEN 'x' ELSE letter END`

`FROM (`

`SELECT n, letter,`

`grouper = ROW_NUMBER() OVER(PARTITION BY t.name ORDER BY ISNUMERIC(letter), n) - N`

`FROM (SELECT TOP(LEN(t.name)) n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))`

`FROM sys.columns a, sys.columns b) tally`

`CROSS APPLY (SELECT letter = SUBSTRING(t.name,n,1)) l`

`) d`

`) stri`

`ORDER BY n`

`FOR XML PATH('') , TYPE).value('.', 'varchar(max)')`

`) x`

`ORDER BY t.name`

โWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ - Gail Shaw

Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

• With a large amount of replaces this can be done.

Problem is that the number of replaces is very large.

So I'll show the concept but not the code, because the code would be large.

First of all I assume that there are 'strings' which do not occure in the supplied string.

Here I'll use the strings '^' and 'x', but if these occure replace then with strings which do not occure.

First replace alle the numbers in the strings with

1 becomes ^1 (do this for all 10 numeric characters)

^1^ becomes ^^1 (do this for all 10 numeric characters and repeat this large number of times).

All number strings have now the shape ^^^^1234

X^^^^^^ becomes x^^^^^

EDIT: ABOVE LINE SHOULD BE (sorry)

^^^^^^ becomes x^^^^^

x^ becomes xx (repeat a number of times)

Now the number strings all have the shape ^^^^1234 or xxxxxxxx12345678

x1 becomes x (do this for all 10 numeric characters and repeat this a large number of) times).

^ becomes '' (remove all the ^)

The number of replaces is large, but maybe could be done in a loops.

Or if the database is very large this can be generated.

If the database is large and the numbers are very long you need an awfull lot of replaces.

For large numbers and large number of rows this might not be a handy method.

An alternate solution might be to copy the table change all numbers into '1' do the above with far less substitutes and than replace the 1111 string with the original numbers. With some masking this is possible.

Ben Brugman

• This is quite impressive but it doesn't work for some scenarios. e.g. this one :

insert into t1

select 'AAA22222222222222BB5BB 33CCCCCCCCC342234234'

EDIT: I meant it for Chris response:

SELECT

t.name, x.New_Name

FROM #t1 t

CROSS APPLY (

SELECT New_Name =

(SELECT Newletter + ''

FROM (

SELECT n, Newletter = CASE WHEN d.grouper IS NOT NULL

AND COUNT(*) OVER (PARTITION BY d.grouper) > 5 THEN 'x' ELSE letter END

FROM (

SELECT n, letter,

grouper = ROW_NUMBER() OVER(PARTITION BY t.name ORDER BY ISNUMERIC(letter), n) - N

FROM (SELECT TOP(LEN(t.name)) n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

FROM sys.columns a, sys.columns b) tally

CROSS APPLY (SELECT letter = SUBSTRING(t.name,n,1)) l

) d

) stri

ORDER BY n

FOR XML PATH('') , TYPE).value('.', 'varchar(max)')

) x

ORDER BY t.name

Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

• S_Kumar_S (10/5/2012)

This is quite impressive but it doesn't work for some scenarios. e.g. this one :

insert into t1

select 'AAA22222222222222BB5BB 33CCCCCCCCC342234234'

EDIT: I meant it for Chris response:

SELECT

t.name, x.New_Name

FROM #t1 t

CROSS APPLY (

SELECT New_Name =

(SELECT Newletter + ''

FROM (

SELECT n, Newletter = CASE WHEN d.grouper IS NOT NULL

AND COUNT(*) OVER (PARTITION BY d.grouper) > 5 THEN 'x' ELSE letter END

FROM (

SELECT n, letter,

grouper = ROW_NUMBER() OVER(PARTITION BY t.name ORDER BY ISNUMERIC(letter), n) - N

FROM (SELECT TOP(LEN(t.name)) n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

FROM sys.columns a, sys.columns b) tally

CROSS APPLY (SELECT letter = SUBSTRING(t.name,n,1)) l

) d

) stri

ORDER BY n

FOR XML PATH('') , TYPE).value('.', 'varchar(max)')

) x

ORDER BY t.name

My apologies, it was a change during coding:

`SELECT`

`t.name, x.New_Name`

`FROM #t1 t`

`CROSS APPLY (`

`SELECT New_Name =`

`(SELECT Newletter + ''`

`FROM (`

`SELECT`

`n, Newletter = CASE WHEN ISNUMERIC(letter) = 1 -- d.grouper IS NOT NULL`

`AND COUNT(*) OVER (PARTITION BY d.grouper) > 5 THEN 'x' ELSE letter END`

`FROM (`

`SELECT t.name, n, letter,`

`grouper = n - ROW_NUMBER() OVER(ORDER BY ISNUMERIC(letter) desc, n)`

`FROM`

`(SELECT TOP(LEN(t.name)) n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))`

`FROM sys.columns a, sys.columns b) tally`

`CROSS APPLY (SELECT letter = SUBSTRING(t.name,n,1)) l`

`) d`

`) stri`

`ORDER BY n`

`FOR XML PATH('') , TYPE).value('.', 'varchar(max)')`

`) x`

`ORDER BY t.name`

Edit: found an error in an extended sample data set.

โWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ - Gail Shaw

Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

• Hello all,

Because I goofed a bit in my previous solution here the worked out code.

It's a solution which only used simple replaces, not efficient (lots of updates).

ben brugman

`-- Ben Brugman`

`-- 20121005`

`-- http://www.sqlservercentral.com/Forums/Topic1368367-391-1.aspx?Update=1`

`-- A quick query puzzle.`

`-- This solution is a rather cumbersome solution only using a large number of 'simple' replaces.`

`-- It is also limited to the length defined in @maxlength (this is the maximum length a number can be).`

`-- Create the table`

`create table #t1(name varchar (100))`

`GO`

`INsert into #t1`

`select '1234ABC123456XYZ1234567890ADS'`

`GO`

`INsert into #t1`

`select 'cbv736456XYZ543534534545XLS'`

`GO`

`-- Create a replace stored procedure.`

`Create Procedure st_replace`

`@string1 varchar(8000),`

`@string2 varchar(8000)`

`as`

`begin`

` Update #t1 set name = replace(name,@string1,@string2)`

`end`

`go`

`declare @maxlength int = 40`

`declare @tel int`

`-- Mark all the numeric characters.`

`exec st_replace '0', '^0'`

`exec st_replace '1', '^1'`

`exec st_replace '2', '^2'`

`exec st_replace '3', '^3'`

`exec st_replace '4', '^4'`

`exec st_replace '5', '^5'`

`exec st_replace '6', '^6'`

`exec st_replace '7', '^7'`

`exec st_replace '8', '^8'`

`exec st_replace '9', '^9'`

`-- Bring the marks in front of the numbers`

`set @tel = @maxlength`

`WHILE @tel > 0 BEGIN`

` exec st_replace '0^','^0'`

` exec st_replace '1^','^1'`

` exec st_replace '2^','^2'`

` exec st_replace '3^','^3'`

` exec st_replace '4^','^4'`

` exec st_replace '5^','^5'`

` exec st_replace '6^','^6'`

` exec st_replace '7^','^7'`

` exec st_replace '8^','^8'`

` exec st_replace '9^','^9'`

` set @tel = @tel - 1`

`END`

`-- Change the marks which are at least 6 long`

`exec st_replace '^^^^^^','X^^^^^'`

`set @tel = @maxlength`

`WHILE @tel > 0 BEGIN`

` exec st_replace 'x^','xx'`

` set @tel = @tel - 1`

`END`

`--`

`-- Numbers are now of the form ^^^^1234 or XXXXXXXX12345678`

`--`

`-- Remove the numbers behind the X mark.`

`set @tel = @maxlength`

`WHILE @tel > 0 BEGIN`

` exec st_replace 'X0','X'`

` exec st_replace 'X1','X'`

` exec st_replace 'X2','X'`

` exec st_replace 'X3','X'`

` exec st_replace 'X4','X'`

` exec st_replace 'X5','X'`

` exec st_replace 'X6','X'`

` exec st_replace 'X7','X'`

` exec st_replace 'X8','X'`

` exec st_replace 'X9','X'`

` set @tel = @tel - 1`

`END`

`-- Remove the 'temporary' marks`

`exec st_replace '^',''`

`SELECT * FROM #T1`

`DROP PROCEDURE ST_REPLACE`

`DROP TABLE #T1`

• Hi

Here's another variation that should do the trick. No function this time, just a query

`create table #t1(name varchar (100))`

`GO`

`INsert into #t1`

`select '1234ABC123456XYZ1234567890ADS'`

`GO`

`INsert into #t1`

`select 'cbv736456XYZ543534534545XLS'`

`GO`

`INsert into #t1`

`select 'cbv736456XYZ543534534545XLS2134488'`

`GO`

`;with cte AS (`

`SELECT`

`stuff(name ,`

`patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name) , --startposition`

`patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1 , --length (extra char added to ensure length returned)`

`replicate('x',patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1) --replacement x's`

`) res,`

`1 mycount,`

`row_number() over (order by name) grouper`

`FROM #t1`

`UNION ALL`

`SELECT`

`stuff(res ,`

`patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) , --startposition`

`patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1 , --length`

`replicate('x',patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1) --replacement x's`

`) res,`

`mycount + 1,`

`grouper`

`FROM cte`

`WHERE patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) > 0`

`),`

`maxcte as (select grouper, max(mycount) lastres from cte group by grouper)`

`SELECT res`

`FROM cte c`

`inner join maxcte m on mycount = lastres and c.grouper = m.grouper`

• A small change See in the code.

(My reason for this change change is: The essential part of the code is only written once, so it is smaller, clearer and with less riscs for errors).

Thanks for submitting this code, I am learning from this.

Ben Brugman

mickyT (10/8/2012)

Hi

Here's another variation that should do the trick. No function this time, just a query

`create table #t1(name varchar (100))`

`GO`

`INsert into #t1`

`select '1234ABC123456XYZ1234567890ADS'`

`GO`

`INsert into #t1`

`select 'cbv736456XYZ543534534545XLS'`

`GO`

`INsert into #t1`

`select 'cbv736456XYZ543534534545XLS2134488'`

`GO`

`;with cte AS (`

`SELECT`

`--stuff(name ,`

`--patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name) , --startposition`

`--patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1 , --length (extra char added to ensure length returned)`

`--replicate('x',patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1) --replacement x's`

`--) res,`

` convert(varchar(max), res) as res,`

`1 mycount,`

`row_number() over (order by name) grouper`

`FROM #t1`

`UNION ALL`

`SELECT`

`stuff(res ,`

`patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) , --startposition`

`patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1 , --length`

`replicate('x',patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1) --replacement x's`

`) res,`

`mycount + 1,`

`grouper`

`FROM cte`

`WHERE patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) > 0`

`),`

`maxcte as (select grouper, max(mycount) lastres from cte group by grouper)`

`SELECT res`

`FROM cte c`

`inner join maxcte m on mycount = lastres and c.grouper = m.grouper`

• ben.brugman (10/8/2012)

A small change See in the code.

(My reason for this change change is: The essential part of the code is only written once, so it is smaller, clearer and with less riscs for errors).

Thanks for submitting this code, I am learning from this.

Ben Brugman

I agree, that change makes it nicer. Cheers:-)

Normally I would have used a set of CLR functions that I have that replicate the oracle regular expression functions.

• Just for fun, here's a nasty piece of work:

`create table #t1(name varchar (100))`

`INsert into #t1`

`select '1234ABC123456XYZ1234567890ADS'`

`INsert into #t1`

`select 'cbv736456XYZ543534534545XLS'`

`;WITH ChunkIt (RowID, n, str1, str2, str3) AS (`

` SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), n=1`

` ,CASE WHEN dig < alp THEN SUBSTRING(name, 1, alp-1) ELSE SUBSTRING(name, 1, dig-1) END`

` ,CASE WHEN dig < alp THEN SUBSTRING(name, alp, LEN(name)) ELSE SUBSTRING(name, dig, LEN(name)) END`

` ,CAST('' AS VARCHAR(100))`

` FROM #t1`

` CROSS APPLY (SELECT PATINDEX('%[0-9]%', name), PATINDEX('%[A-Za-z]%', name)) a(dig, alp)`

` UNION ALL`

` SELECT RowID, n+1`

` ,b.str1`

` ,b.str2`

` ,CASE WHEN PATINDEX('%[0-9]%', b.str2) = 0 OR PATINDEX('%[A-Za-z]%', b.str2) = 0 THEN b.str2 ELSE '' END`

` FROM ChunkIt`

` CROSS APPLY (SELECT PATINDEX('%[0-9]%', str2), PATINDEX('%[A-Za-z]%', str2)) a(dig, alp)`

` CROSS APPLY (`

` SELECT CASE WHEN a.dig < a.alp THEN SUBSTRING(str2, 1, a.alp-1) ELSE SUBSTRING(str2, 1, a.dig-1) END`

` ,CASE WHEN a.dig < a.alp THEN SUBSTRING(str2, a.alp, LEN(str2)) ELSE SUBSTRING(str2, a.dig, LEN(str2)) END`

` ) b(str1, str2)`

` WHERE a.dig > 0 AND a.alp > 0`

` ),`

` ForGrouping AS (`

` SELECT RowID, n`

` ,str1=CASE WHEN LEN(str1) >= 6 AND PATINDEX('%[0-9]%', str1) > 0 THEN REPLICATE('x', LEN(str1)) ELSE str1 END`

` ,str2=CASE WHEN LEN(str3) >= 6 AND PATINDEX('%[0-9]%', str3) > 0 THEN REPLICATE('x', LEN(str3)) ELSE str3 END`

` FROM ChunkIt`

` )`

`SELECT name=(`

` SELECT str1 + str2`

` FROM ForGrouping b`

` WHERE a.RowID = b.RowID`

` FOR XML PATH(''))`

`FROM ForGrouping a`

`GROUP BY RowID`

`ORDER BY RowID`

`DROP TABLE #t1`

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

My thought question: Have you ever been told that your query runs too fast?