Multiple Space into one + 65,33,17,9,5,3,2

  • All,

    I have recently read Jeff Moden's Multiple space into one space article and i saw M's reply on this topic.

    He gave the below solution to resolve the same issue.

    set @s-2 = '*' + replicate(' ',7998) + '*'

    select len(@s)

    set @s-2 = replace(replace(replace(replace(replace(replace(ltrim(rtrim(@s)),

    replicate(' ',32+1),' '),replicate(' ',16+1),' '),replicate(' ',8+1),' '),replicate(' ',4+1),' '),replicate(' ',2+1),' '),replicate(' ',1+1),' ')

    select len(@s)

    select @s-2

    CREATE FUNCTION dbo.fn_CleanUp_MichaelMeierruth_MKII(@S VARCHAR(8000))

    RETURNS VARCHAR(8000)

    BEGIN

    RETURN replace(replace(replace(replace(replace(replace(replace(ltrim(rtrim(@s)),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' ')

    END

    Seems like both are same except one extra REPLACE.

    Spaces are manually added and in the first one REPLICATE function is used to do the same.

    My question is, what is this number denotes?

    65,33,17,9,5,3,2

    On what basis, the above numbers are selected. what is the mathematics behind these numbers?

    on what basis we have to choose the number?

    say for example , I have to delete space from the below string.

    select 'karthik keyan 44768 sql server 2008'

    which number i have to use? how many REPLACE i have to use?

    karthik

  • karthikeyan-444867 (9/24/2012)


    All,

    I have recently read Jeff Moden's Multiple space into one space article and i saw M's reply on this topic.

    He gave the below solution to resolve the same issue.

    set @s-2 = '*' + replicate(' ',7998) + '*'

    select len(@s)

    set @s-2 = replace(replace(replace(replace(replace(replace(ltrim(rtrim(@s)),

    replicate(' ',32+1),' '),replicate(' ',16+1),' '),replicate(' ',8+1),' '),replicate(' ',4+1),' '),replicate(' ',2+1),' '),replicate(' ',1+1),' ')

    select len(@s)

    select @s-2

    CREATE FUNCTION dbo.fn_CleanUp_MichaelMeierruth_MKII(@S VARCHAR(8000))

    RETURNS VARCHAR(8000)

    BEGIN

    RETURN replace(replace(replace(replace(replace(replace(replace(ltrim(rtrim(@s)),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' ')

    END

    Seems like both are same except one extra REPLACE.

    Spaces are manually added and in the first one REPLICATE function is used to do the same.

    My question is, what is this number denotes?

    65,33,17,9,5,3,2

    On what basis, the above numbers are selected. what is the mathematics behind these numbers?

    on what basis we have to choose the number?

    say for example , I have to delete space from the below string.

    select 'karthik keyan 44768 sql server 2008'

    which number i have to use? how many REPLACE i have to use?

    HUH???

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • say for example , I have to delete space from the below string.

    select 'karthik keyan 44768 sql server 2008'

    which number i have to use? how many REPLACE i have to use?

    One.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Deleting space completely is an entirely different exercise. Like Phil mentioned - that just takes one REPLACE to do that.

    The numbers were chosen to avoid having to run the cleaning process multiple times: it makes sure that the replace doesn't leave a smaller set that still is more than 1 in a row without it having to do a ludicrous amount of operations within a given string.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Your request concerned removing a single space, while Jeff's arcticle's objective was:

    This article explains the "puzzle solving" behind a common set based method to replace unknown numbers of adjacent spaces with a single space

    Bold emphasis added by this poster

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • karthikeyan-444867 (9/24/2012)


    All,

    I have recently read Jeff Moden's Multiple space into one space article and i saw M's reply on this topic.

    He gave the below solution to resolve the same issue.

    set @s-2 = '*' + replicate(' ',7998) + '*'

    select len(@s)

    set @s-2 = replace(replace(replace(replace(replace(replace(ltrim(rtrim(@s)),

    replicate(' ',32+1),' '),replicate(' ',16+1),' '),replicate(' ',8+1),' '),replicate(' ',4+1),' '),replicate(' ',2+1),' '),replicate(' ',1+1),' ')

    select len(@s)

    select @s-2

    CREATE FUNCTION dbo.fn_CleanUp_MichaelMeierruth_MKII(@S VARCHAR(8000))

    RETURNS VARCHAR(8000)

    BEGIN

    RETURN replace(replace(replace(replace(replace(replace(replace(ltrim(rtrim(@s)),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' '),

    ' ',' ')

    END

    Seems like both are same except one extra REPLACE.

    Spaces are manually added and in the first one REPLICATE function is used to do the same.

    My question is, what is this number denotes?

    65,33,17,9,5,3,2

    On what basis, the above numbers are selected. what is the mathematics behind these numbers?

    on what basis we have to choose the number?

    say for example , I have to delete space from the below string.

    select 'karthik keyan 44768 sql server 2008'

    which number i have to use? how many REPLACE i have to use?

    Most of the numbers come from powers of 2 + 1. In this particular case, it's an optimization for performance.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • bitbucket-25253 (9/24/2012)


    Your request concerned removing a single space, while Jeff's arcticle's objective was:

    This article explains the "puzzle solving" behind a common set based method to replace unknown numbers of adjacent spaces with a single space

    Bold emphasis added by this poster

    oops...

    my string is

    select 'karthik keyan 44768 sql server 2008 '

    I wrongly used only one space between all the string. Now, How many REPLACE I have to use? On what basis the number of REPLACE selected?

    karthik

  • karthikeyan-444867 (9/24/2012)


    bitbucket-25253 (9/24/2012)


    Your request concerned removing a single space, while Jeff's arcticle's objective was:

    This article explains the "puzzle solving" behind a common set based method to replace unknown numbers of adjacent spaces with a single space

    Bold emphasis added by this poster

    oops...

    my string is

    select 'karthik keyan 44768 sql server 2008 '

    I wrongly used only one space between all the string. Now, How many REPLACE I have to use? On what basis the number of REPLACE selected?

    In order to more readily view the code I have replaced blanks (spaces) with asteriks

    DECLARE @k VARCHAR(46)

    SET @k = 'karthik****keyan***44768**sql*server*2008**' --As modified

    SELECT @k,REPLACE(@K,'*','')

    /* Result:

    karthik****keyan***44768**sql*server*2008**

    karthikkeyan44768sqlserver2008

    */

    Hopefully to make the REPLACE function more understandable I did the following

    DECLARE @k VARCHAR(46)

    SET @k = 'karthik****keyan***44768**sql*server*2008**' --As modified

    SELECT @k,REPLACE(@K,'**','')

    /* Results:

    karthik****keyan***44768**sql*server*2008**

    karthikkeyan*44768sql*server*2008

    */

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • i think...Again my requiremtn is wrong..

    I just corrected it now...

    select ''karthik keyan 44768 sql server 2008 '

    Expected Output:

    select ''karthik keyan 44768 sql server 2008'

    i.e single space between each and every word.

    If I use the below code, it will replace just all the spaces. right?

    DECLARE @k VARCHAR(46)

    SET @k = 'karthik****keyan***44768**sql*server*2008**' --As modified

    SELECT @k,REPLACE(@K,'**','')

    /* Results:

    karthik****keyan***44768**sql*server*2008**

    karthikkeyan*44768sql*server*2008

    */

    Now..How many REPLACE i have to use? On what basis the number of REPLACE will be choosed?

    karthik

  • Most of the numbers come from powers of 2 + 1. In this particular case, it's an optimization for performance

    okay. why don't i use +2 or +3 instead of +1 from power of 2?

    what is the logic behind this?

    karthik

  • I have also done some quick simulations to see maximum lengths of strings that can be handled with each value of N. These are the results so far:

    N=1 10

    N=2 38

    N=3 286

    N=4 4622

    N=5 151534

    How to identify/find this ?

    karthik

  • karthikeyan-444867 (9/25/2012)


    Most of the numbers come from powers of 2 + 1. In this particular case, it's an optimization for performance

    okay. why don't i use +2 or +3 instead of +1 from power of 2?

    what is the logic behind this?

    Hmmmm... good question. I trusted in what many others said about the +1 being an optimization but made the terrible mistake of not testing it. I have much more testing to do but here's an example of where the +1 actually doesn't appear to be an optimization.

    --Using the supposed +1 optimization (4 REPLACEs did something)

    XXXXXXXXXXXXXXX --15 SPACES

    0XXXXXX --REPLACED 9 SPACES (2^3+1) WITH 1

    000X --REPLACED 5 SPACES (2^2+1) WITH 1

    0X --REPLACED 3 SPACES (2^0+1) WITH 1

    0 --REPLACED 2 SPACES WITH 1 --Done

    0 --REPLACE 2 SPACES WITH 1 --Nothing to do

    --Using straight powers of 2 (3 REPLACEs did something)

    XXXXXXXXXXXXXXX --15 SPACES

    0XXXXXXX --REPLACED 8 SPACES (2^3) WITH 1

    00 --REPLACED 4 SPACES (2^2) TWICE WITH 1

    0 --REPLACED 2 SPACES (2^1) WITH 1 --Done

    0 --REPLACE 2 SPACES WITH 1 --Nothing to do

    Maybe the +1 is an optimization because it might be less expensive to execute 2 separate REPLACEs than it is to have 1 REPLACE do 2 replacements. Only testing for performance would tell.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Using the code from Jeff Moden's article:

    DECLARE @OriginalString VARCHAR(70)

    SET @OriginalString = 'karthik keyan 44768 sql server 2008 '

    SELECT REPLACE(

    REPLACE(

    REPLACE(

    LTRIM(RTRIM(@OriginalString))

    ,' ',' '+CHAR(7)) --Changes 2 spaces to the OX model

    ,CHAR(7)+' ','') --Changes the XO model to nothing

    ,CHAR(7),'') AS CleanString --Changes the remaining X's to nothing

    WHERE CHARINDEX(' ',@OriginalString) > 0

    Result:

    CleanString:

    karthik keyan 44768 sql server 2008

    Now changing your string just to show that Jeff's code works correctly

    DECLARE @OriginalString VARCHAR(70)

    --SET @OriginalString = 'karthik keyan 44768 sql server 2008 '

    SET @OriginalString = 'karthik keyan 44768 sql server 2008 '

    SELECT REPLACE(

    REPLACE(

    REPLACE(

    LTRIM(RTRIM(@OriginalString))

    ,' ',' '+CHAR(7)) --Changes 2 spaces to the OX model

    ,CHAR(7)+' ','') --Changes the XO model to nothing

    ,CHAR(7),'') AS CleanString --Changes the remaining X's to nothing

    WHERE CHARINDEX(' ',@OriginalString) > 0

    /* Result

    CleanString

    karthik keyan 44768 sql server 2008

    */

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

  • karthikeyan-444867 (9/25/2012)


    Most of the numbers come from powers of 2 + 1. In this particular case, it's an optimization for performance

    okay. why don't i use +2 or +3 instead of +1 from power of 2?

    what is the logic behind this?

    The idea is to come up with a non-overlapping descending series. The challenge you're trying to avoid is that the space you're putting back in doesn't get looked at again during the same replace, so you have to structure your numbers so that the max number of spaces it might leave behind doesn't consitute yet another match it should have removed. So the +1 is an attempt to prevent that overrrun from happening (and you want this to be as small as possible, so 1 is better than other options).

    That said - any number of progressions could work just fine. This one seems to return the correct results consistently as well:

    65537,257,17,5,3,2

    it's a matter of finding the sweet spot for your data (the progression you choose will determine how many replacements interally are occurring for different data sizes, so a less aggressive progression might do fewer replacements on smaller strings, but a lot more on larger ones).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • bitbucket-25253 (9/25/2012)


    Using the code from Jeff Moden's article:

    DECLARE @OriginalString VARCHAR(70)

    SET @OriginalString = 'karthik keyan 44768 sql server 2008 '

    SELECT REPLACE(

    REPLACE(

    REPLACE(

    LTRIM(RTRIM(@OriginalString))

    ,' ',' '+CHAR(7)) --Changes 2 spaces to the OX model

    ,CHAR(7)+' ','') --Changes the XO model to nothing

    ,CHAR(7),'') AS CleanString --Changes the remaining X's to nothing

    WHERE CHARINDEX(' ',@OriginalString) > 0

    Result:

    CleanString:

    karthik keyan 44768 sql server 2008

    Now changing your string just to show that Jeff's code works correctly

    DECLARE @OriginalString VARCHAR(70)

    --SET @OriginalString = 'karthik keyan 44768 sql server 2008 '

    SET @OriginalString = 'karthik keyan 44768 sql server 2008 '

    SELECT REPLACE(

    REPLACE(

    REPLACE(

    LTRIM(RTRIM(@OriginalString))

    ,' ',' '+CHAR(7)) --Changes 2 spaces to the OX model

    ,CHAR(7)+' ','') --Changes the XO model to nothing

    ,CHAR(7),'') AS CleanString --Changes the remaining X's to nothing

    WHERE CHARINDEX(' ',@OriginalString) > 0

    /* Result

    CleanString

    karthik keyan 44768 sql server 2008

    */

    The problem with that is stated at the beginning of the article. My method is much less optimal than the methods revealed in the discussion attached to the article.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 27 total)

You must be logged in to reply to this topic. Login to reply