A quick query puzzle:

  • S_Kumar_S

    SSCrazy Eights

    Points: 9004

    Hi

    I have a table with below data. Requirement is to replace all integers with continous 6 or more occurances with 'x'. Less than 6 occurances should not be replaced.

    create table t1(name varchar (100))

    GO

    INsert into t1

    select '1234ABC123456XYZ1234567890ADS'

    GO

    INsert into t1

    select 'cbv736456XYZ543534534545XLS'

    GO

    EXPECTED RESULT:

    1234ABCxxxxxxXYZxxxxxxxxxxADS

    cbvxxxxxxXYZxxxxxxxxxxxxXLS

    drop table t1

    Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

  • mickyT

    SSChampion

    Points: 10360

    Hi

    You could try the following

    CREATE FUNCTION replace6plusint(@s1 varchar(100)) RETURNS TABLE AS RETURN

    WITH

    firstRun AS (

    SELECT replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(@s1,'0','^'),'1','^'),'2','^'),'3','^'),'4','^'),'5','^'),'6','^'),'7','^'),'8','^'),'9','^'),'^^^^^^','------') a

    UNION ALL

    SELECT replace(a,'-^','--')

    FROM firstRun

    WHERE a like '%-^%'

    ),

    secondRun AS (

    SELECT replace(a,'-','x') b, charindex('^', a) p

    FROM firstRun

    WHERE a not like '%-^%'

    UNION ALL

    SELECT stuff(b, p, 1, substring(@s1,p,1)), charindex('^', b)

    FROM secondRun

    WHERE p <> 0

    )

    SELECT b result FROM secondRun WHERE p = 0

  • S_Kumar_S

    SSCrazy Eights

    Points: 9004

    This is really good but problem is that my table has multiple rows and this is a table function. So it has to be a scalar function for me to use....

    Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

  • ChrisM@Work

    SSC Guru

    Points: 186094

    S_Kumar_S (10/5/2012)


    This is really good but problem is that my table has multiple rows and this is a table function. So it has to be a scalar function for me to use....

    Find some good reading material covering functions. You absolutely don't need a function for this - and most folks would recommend a table-valued function over the alternatives. It's far simpler than you think:

    SELECT name,

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(

    REPLACE(name,

    '012345','xxxxxx'),

    '123456','xxxxxx'),

    '234567','xxxxxx'),

    '345678','xxxxxx'),

    '456789','xxxxxx')

    FROM t1

    [font="Arial"]“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw[/font]


    For fast, accurate and documented assistance in answering your questions, please read this article[/url].
    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]
    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
    [url

  • S_Kumar_S

    SSCrazy Eights

    Points: 9004

    Hi Chris.

    the numbers are not consecutive as you have assumed. And they may be of length greater than 6. I think it is not so straight as it looks.

    Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

  • ben.brugman

    SSChampion

    Points: 13350

    With a large amount of replaces this can be done.

    Problem is that the number of replaces is very large.

    So I'll show the concept but not the code, because the code would be large.

    First of all I assum that there are 'strings' which do not occure in the supplied string.

    Here I'll use the strings '^' and 'x', but if these occure replace then with strings which do not occure.

    First replace alle the numbers in the strings with

    1

  • ChrisM@Work

    SSC Guru

    Points: 186094

    S_Kumar_S (10/5/2012)


    Hi Chris.

    the numbers are not consecutive as you have assumed. And they may be of length greater than 6. I think it is not so straight as it looks.

    Ah, ok.

    SELECT

    t.name, x.New_Name

    FROM #t1 t

    CROSS APPLY (

    SELECT New_Name =

    (SELECT Newletter + ''

    FROM (

    SELECT n, Newletter = CASE WHEN d.grouper IS NOT NULL

    AND COUNT(*) OVER (PARTITION BY d.grouper) > 5 THEN 'x' ELSE letter END

    FROM (

    SELECT n, letter,

    grouper = ROW_NUMBER() OVER(PARTITION BY t.name ORDER BY ISNUMERIC(letter), n) - N

    FROM (SELECT TOP(LEN(t.name)) n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

    FROM sys.columns a, sys.columns b) tally

    CROSS APPLY (SELECT letter = SUBSTRING(t.name,n,1)) l

    ) d

    ) stri

    ORDER BY n

    FOR XML PATH('') , TYPE).value('.', 'varchar(max)')

    ) x

    ORDER BY t.name

    [font="Arial"]“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw[/font]


    For fast, accurate and documented assistance in answering your questions, please read this article[/url].
    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]
    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
    [url

  • ben.brugman

    SSChampion

    Points: 13350

    With a large amount of replaces this can be done.

    Problem is that the number of replaces is very large.

    So I'll show the concept but not the code, because the code would be large.

    First of all I assume that there are 'strings' which do not occure in the supplied string.

    Here I'll use the strings '^' and 'x', but if these occure replace then with strings which do not occure.

    First replace alle the numbers in the strings with

    1 becomes ^1 (do this for all 10 numeric characters)

    ^1^ becomes ^^1 (do this for all 10 numeric characters and repeat this large number of times).

    All number strings have now the shape ^^^^1234

    X^^^^^^ becomes x^^^^^

    EDIT: ABOVE LINE SHOULD BE (sorry)

    ^^^^^^ becomes x^^^^^

    x^ becomes xx (repeat a number of times)

    Now the number strings all have the shape ^^^^1234 or xxxxxxxx12345678

    x1 becomes x (do this for all 10 numeric characters and repeat this a large number of) times).

    ^ becomes '' (remove all the ^)

    The number of replaces is large, but maybe could be done in a loops.

    Or if the database is very large this can be generated.

    If the database is large and the numbers are very long you need an awfull lot of replaces.

    For large numbers and large number of rows this might not be a handy method.

    An alternate solution might be to copy the table change all numbers into '1' do the above with far less substitutes and than replace the 1111 string with the original numbers. With some masking this is possible.

    thanks for sharing your problem,

    Ben Brugman

  • S_Kumar_S

    SSCrazy Eights

    Points: 9004

    This is quite impressive but it doesn't work for some scenarios. e.g. this one :

    insert into t1

    select 'AAA22222222222222BB5BB 33CCCCCCCCC342234234'

    EDIT: I meant it for Chris response:

    SELECT

    t.name, x.New_Name

    FROM #t1 t

    CROSS APPLY (

    SELECT New_Name =

    (SELECT Newletter + ''

    FROM (

    SELECT n, Newletter = CASE WHEN d.grouper IS NOT NULL

    AND COUNT(*) OVER (PARTITION BY d.grouper) > 5 THEN 'x' ELSE letter END

    FROM (

    SELECT n, letter,

    grouper = ROW_NUMBER() OVER(PARTITION BY t.name ORDER BY ISNUMERIC(letter), n) - N

    FROM (SELECT TOP(LEN(t.name)) n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

    FROM sys.columns a, sys.columns b) tally

    CROSS APPLY (SELECT letter = SUBSTRING(t.name,n,1)) l

    ) d

    ) stri

    ORDER BY n

    FOR XML PATH('') , TYPE).value('.', 'varchar(max)')

    ) x

    ORDER BY t.name

    Online Trainer For SQL DBA and Developer @RedBushTechnologies with 18 yrs exp.

  • ChrisM@Work

    SSC Guru

    Points: 186094

    S_Kumar_S (10/5/2012)


    This is quite impressive but it doesn't work for some scenarios. e.g. this one :

    insert into t1

    select 'AAA22222222222222BB5BB 33CCCCCCCCC342234234'

    EDIT: I meant it for Chris response:

    SELECT

    t.name, x.New_Name

    FROM #t1 t

    CROSS APPLY (

    SELECT New_Name =

    (SELECT Newletter + ''

    FROM (

    SELECT n, Newletter = CASE WHEN d.grouper IS NOT NULL

    AND COUNT(*) OVER (PARTITION BY d.grouper) > 5 THEN 'x' ELSE letter END

    FROM (

    SELECT n, letter,

    grouper = ROW_NUMBER() OVER(PARTITION BY t.name ORDER BY ISNUMERIC(letter), n) - N

    FROM (SELECT TOP(LEN(t.name)) n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

    FROM sys.columns a, sys.columns b) tally

    CROSS APPLY (SELECT letter = SUBSTRING(t.name,n,1)) l

    ) d

    ) stri

    ORDER BY n

    FOR XML PATH('') , TYPE).value('.', 'varchar(max)')

    ) x

    ORDER BY t.name

    My apologies, it was a change during coding:

    SELECT

    t.name, x.New_Name

    FROM #t1 t

    CROSS APPLY (

    SELECT New_Name =

    (SELECT Newletter + ''

    FROM (

    SELECT

    n, Newletter = CASE WHEN ISNUMERIC(letter) = 1 -- d.grouper IS NOT NULL

    AND COUNT(*) OVER (PARTITION BY d.grouper) > 5 THEN 'x' ELSE letter END

    FROM (

    SELECT t.name, n, letter,

    grouper = n - ROW_NUMBER() OVER(ORDER BY ISNUMERIC(letter) desc, n)

    FROM

    (SELECT TOP(LEN(t.name)) n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))

    FROM sys.columns a, sys.columns b) tally

    CROSS APPLY (SELECT letter = SUBSTRING(t.name,n,1)) l

    ) d

    ) stri

    ORDER BY n

    FOR XML PATH('') , TYPE).value('.', 'varchar(max)')

    ) x

    ORDER BY t.name

    Edit: found an error in an extended sample data set.

    [font="Arial"]“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw[/font]


    For fast, accurate and documented assistance in answering your questions, please read this article[/url].
    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]
    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
    [url

  • ben.brugman

    SSChampion

    Points: 13350

    Hello all,

    Because I goofed a bit in my previous solution here the worked out code.

    It's a solution which only used simple replaces, not efficient (lots of updates).

    ben brugman

    -- Ben Brugman

    -- 20121005

    -- http://www.sqlservercentral.com/Forums/Topic1368367-391-1.aspx?Update=1

    -- A quick query puzzle.

    -- This solution is a rather cumbersome solution only using a large number of 'simple' replaces.

    -- It is also limited to the length defined in @maxlength (this is the maximum length a number can be).

    -- Create the table

    create table #t1(name varchar (100))

    GO

    INsert into #t1

    select '1234ABC123456XYZ1234567890ADS'

    GO

    INsert into #t1

    select 'cbv736456XYZ543534534545XLS'

    GO

    -- Create a replace stored procedure.

    Create Procedure st_replace

    @string1 varchar(8000),

    @string2 varchar(8000)

    as

    begin

    Update #t1 set name = replace(name,@string1,@string2)

    end

    go

    declare @maxlength int = 40

    declare @tel int

    -- Mark all the numeric characters.

    exec st_replace '0', '^0'

    exec st_replace '1', '^1'

    exec st_replace '2', '^2'

    exec st_replace '3', '^3'

    exec st_replace '4', '^4'

    exec st_replace '5', '^5'

    exec st_replace '6', '^6'

    exec st_replace '7', '^7'

    exec st_replace '8', '^8'

    exec st_replace '9', '^9'

    -- Bring the marks in front of the numbers

    set @tel = @maxlength

    WHILE @tel > 0 BEGIN

    exec st_replace '0^','^0'

    exec st_replace '1^','^1'

    exec st_replace '2^','^2'

    exec st_replace '3^','^3'

    exec st_replace '4^','^4'

    exec st_replace '5^','^5'

    exec st_replace '6^','^6'

    exec st_replace '7^','^7'

    exec st_replace '8^','^8'

    exec st_replace '9^','^9'

    set @tel = @tel - 1

    END

    -- Change the marks which are at least 6 long

    exec st_replace '^^^^^^','X^^^^^'

    set @tel = @maxlength

    WHILE @tel > 0 BEGIN

    exec st_replace 'x^','xx'

    set @tel = @tel - 1

    END

    --

    -- Numbers are now of the form ^^^^1234 or XXXXXXXX12345678

    --

    -- Remove the numbers behind the X mark.

    set @tel = @maxlength

    WHILE @tel > 0 BEGIN

    exec st_replace 'X0','X'

    exec st_replace 'X1','X'

    exec st_replace 'X2','X'

    exec st_replace 'X3','X'

    exec st_replace 'X4','X'

    exec st_replace 'X5','X'

    exec st_replace 'X6','X'

    exec st_replace 'X7','X'

    exec st_replace 'X8','X'

    exec st_replace 'X9','X'

    set @tel = @tel - 1

    END

    -- Remove the 'temporary' marks

    exec st_replace '^',''

    SELECT * FROM #T1

    DROP PROCEDURE ST_REPLACE

    DROP TABLE #T1

  • mickyT

    SSChampion

    Points: 10360

    Hi

    Here's another variation that should do the trick. No function this time, just a query

    create table #t1(name varchar (100))

    GO

    INsert into #t1

    select '1234ABC123456XYZ1234567890ADS'

    GO

    INsert into #t1

    select 'cbv736456XYZ543534534545XLS'

    GO

    INsert into #t1

    select 'cbv736456XYZ543534534545XLS2134488'

    GO

    ;with cte AS (

    SELECT

    stuff(name ,

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name) , --startposition

    patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1 , --length (extra char added to ensure length returned)

    replicate('x',patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1) --replacement x's

    ) res,

    1 mycount,

    row_number() over (order by name) grouper

    FROM #t1

    UNION ALL

    SELECT

    stuff(res ,

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) , --startposition

    patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1 , --length

    replicate('x',patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1) --replacement x's

    ) res,

    mycount + 1,

    grouper

    FROM cte

    WHERE patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) > 0

    ),

    maxcte as (select grouper, max(mycount) lastres from cte group by grouper)

    SELECT res

    FROM cte c

    inner join maxcte m on mycount = lastres and c.grouper = m.grouper

  • ben.brugman

    SSChampion

    Points: 13350

    A small change See in the code.

    (My reason for this change change is: The essential part of the code is only written once, so it is smaller, clearer and with less riscs for errors).

    Thanks for submitting this code, I am learning from this.

    Ben Brugman

    mickyT (10/8/2012)


    Hi

    Here's another variation that should do the trick. No function this time, just a query

    create table #t1(name varchar (100))

    GO

    INsert into #t1

    select '1234ABC123456XYZ1234567890ADS'

    GO

    INsert into #t1

    select 'cbv736456XYZ543534534545XLS'

    GO

    INsert into #t1

    select 'cbv736456XYZ543534534545XLS2134488'

    GO

    ;with cte AS (

    SELECT

    -- stuff(name ,

    -- patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name) , --startposition

    -- patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1 , --length (extra char added to ensure length returned)

    -- replicate('x',patindex('%[^0-9]%', substring(name + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',name),99)) - 1) --replacement x's

    -- ) res,

    convert(varchar(max), res) as res,

    1 mycount,

    row_number() over (order by name) grouper

    FROM #t1

    UNION ALL

    SELECT

    stuff(res ,

    patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) , --startposition

    patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1 , --length

    replicate('x',patindex('%[^0-9]%', substring(res + '@',patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res),99)) - 1) --replacement x's

    ) res,

    mycount + 1,

    grouper

    FROM cte

    WHERE patindex('%[0-9][0-9][0-9][0-9][0-9][0-9]%',res) > 0

    ),

    maxcte as (select grouper, max(mycount) lastres from cte group by grouper)

    SELECT res

    FROM cte c

    inner join maxcte m on mycount = lastres and c.grouper = m.grouper

  • mickyT

    SSChampion

    Points: 10360

    ben.brugman (10/8/2012)


    A small change See in the code.

    (My reason for this change change is: The essential part of the code is only written once, so it is smaller, clearer and with less riscs for errors).

    Thanks for submitting this code, I am learning from this.

    Ben Brugman

    I agree, that change makes it nicer. Cheers:-)

    ditto on the learning ... and I have enjoyed thinking about this problem 😎

    Normally I would have used a set of CLR functions that I have that replicate the oracle regular expression functions.

  • Dwain Camps

    SSC Guru

    Points: 86893

    Just for fun, here's a nasty piece of work:

    create table #t1(name varchar (100))

    INsert into #t1

    select '1234ABC123456XYZ1234567890ADS'

    INsert into #t1

    select 'cbv736456XYZ543534534545XLS'

    ;WITH ChunkIt (RowID, n, str1, str2, str3) AS (

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), n=1

    ,CASE WHEN dig < alp THEN SUBSTRING(name, 1, alp-1) ELSE SUBSTRING(name, 1, dig-1) END

    ,CASE WHEN dig < alp THEN SUBSTRING(name, alp, LEN(name)) ELSE SUBSTRING(name, dig, LEN(name)) END

    ,CAST('' AS VARCHAR(100))

    FROM #t1

    CROSS APPLY (SELECT PATINDEX('%[0-9]%', name), PATINDEX('%[A-Za-z]%', name)) a(dig, alp)

    UNION ALL

    SELECT RowID, n+1

    ,b.str1

    ,b.str2

    ,CASE WHEN PATINDEX('%[0-9]%', b.str2) = 0 OR PATINDEX('%[A-Za-z]%', b.str2) = 0 THEN b.str2 ELSE '' END

    FROM ChunkIt

    CROSS APPLY (SELECT PATINDEX('%[0-9]%', str2), PATINDEX('%[A-Za-z]%', str2)) a(dig, alp)

    CROSS APPLY (

    SELECT CASE WHEN a.dig < a.alp THEN SUBSTRING(str2, 1, a.alp-1) ELSE SUBSTRING(str2, 1, a.dig-1) END

    ,CASE WHEN a.dig < a.alp THEN SUBSTRING(str2, a.alp, LEN(str2)) ELSE SUBSTRING(str2, a.dig, LEN(str2)) END

    ) b(str1, str2)

    WHERE a.dig > 0 AND a.alp > 0

    ),

    ForGrouping AS (

    SELECT RowID, n

    ,str1=CASE WHEN LEN(str1) >= 6 AND PATINDEX('%[0-9]%', str1) > 0 THEN REPLICATE('x', LEN(str1)) ELSE str1 END

    ,str2=CASE WHEN LEN(str3) >= 6 AND PATINDEX('%[0-9]%', str3) > 0 THEN REPLICATE('x', LEN(str3)) ELSE str3 END

    FROM ChunkIt

    )

    SELECT name=(

    SELECT str1 + str2

    FROM ForGrouping b

    WHERE a.RowID = b.RowID

    FOR XML PATH(''))

    FROM ForGrouping a

    GROUP BY RowID

    ORDER BY RowID

    DROP TABLE #t1


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 15 posts - 1 through 15 (of 45 total)

You must be logged in to reply to this topic. Login to reply