SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

Cleaning up excess contiguous spaces in strings. A mystery.

/* Have you ever come across the problem where you have to remove spaces from a string that has too many? You'll know that if you do the obvious ...*/
SELECT REPLACE ('this         has            too                          many                               spaces' ,'  ', ' ')
-- ... you end up with your problem only slightly diminished.
--this     has      too             many                spaces

/* you could, of course, just repeat this until all the extra spaces are gone, but this isn't going to help if you need to do this rapidly as an inline process as un update to a huge table of strings. Yet, this sort of data-cleaning is done a lot.
I must admit that I've always used a scalar function for this, which hid an iterative process. */
DECLARE @trimmed VARCHAR(255) ,
@LastTrimmed INT

@trimmed = 'this    has           too  many        spaces' ,
@LastTrimmed = 0

WHILE LEN(@Trimmed) <> @LastTrimmed
SELECT  @LastTrimmed = LEN(@Trimmed) ,
@trimmed = REPLACE(@trimmed, '  ', ' ')
SELECT  @Trimmed

/* but when I thought about it, (I was re-writing Robyn Page's String Manipulation Workbench for her, to bring it up to date) it seemed inefficient. This is a lot better as long as you don't have more than 208 continuous spaces. (no, you're not going to find that in an address) */

'this           has              too                            many                                 spaces' ,
REPLICATE(CHAR(32), 5), CHAR(32)),
REPLICATE(CHAR(32), 4), CHAR(32)),
REPLICATE(CHAR(32), 3), CHAR(32)),

/* If you test this out, you'll find that it will only clean out 38 contiguous spaces if you miss out the replacement for six consecutive spaces, and 10 if you miss out both the five and six consecutive spaces.
If, instead of 2 3 4 5 6, you chose 2 3 4 7 21 you get upto 461 contiguous spaces cleaned up rather then 208.
What is the maths behind this then?Is this the optimum sequence, or is there something better? (2 3 4 6 21 seems to do as well)

Help! Does anyone know of a helpful mathematician? I'd love an explanation


Posted by Mladen Prajdic on 1 October 2010

Why complicate things too much when you can just do this?

Declare @var varchar(1000)

Set @var = 'this         has            too                          many                               spaces'

Select @var, Replace(Replace(Replace(@var,' ',' '),' ',''),'','')

Posted by Phil Factor on 1 October 2010

Neat! Best yet

how about maybe... (slight refinement)

Declare @var varchar(1000)

Set @var = 'this         has            too                          many                               spaces'

Select  Replace(Replace(Replace(@var,CHAR(32),CHAR(32)+CHAR(160)),CHAR(160)+CHAR(32),''),CHAR(160),'' )

-- it uses the non-break space char which logically wouldn't be in any word

Posted by Phil Factor on 1 October 2010

... but I'm still intrigued by the problem of the sequence of spaces, in case anyone knows why.

Posted by arturmariojr on 24 February 2011

Hi, Phil. Your lessons were of great helpfull to my work and I will take the opportunity to thanks a lot!

This intriguing code you post result in a small programm into hp48 calculator. It showed that other number of spaces bigger than 208 would work too (try 406, but 214 doesn't work).

We must remember that replace passes once on original string replacing each sequence of 6 spaces creating another new string with one space; so, if the number of sets of 6 spaces is high it will result in high number of simple spaces, which will be added to the rest of division by 6; this new string will pass by 5 divider and so on. At the end, the division by 2 may result in larger than one space result.

Leave a Comment

Please register or log in to leave a comment.