A Quick TRIM, Sir?

Phil Factor, 2011-01-28

The other day, I was answering a question on the ASK.SQLSERVERCENTRAL.COM forum from a programmer who had hit a

problem where he thought that LTRIM and RTRIM had failed to work. Although, I wasn’t entirely certain, I felt sure it

was because of the ‘Nul’ characters that he admitted had gotten into his VARCHAR column values. The ASCII ‘Nul’ character,

‘control At’ (^@) is, in SQL, CHAR(0). It should not to be confused with SQL’s NULL, meaning unknown

LTRIM and RTRIM consider a ‘Nul’ character to be a ‘non-space’ character, and will quit trimming at that point, whereas it

is represented by the result pane of the Query Window of SSMS as a space. The same is true of any other character that

is represented as spaces between words but isn’t a space character, such as the non-breaking space Char(160). This can

cause hours of frustration to a developer trying to track down a bug. LTRIM and RTRIM are pretty-well broken as they

don’t work like the TRIM functions of more enlightened languages.

Firstly, here are some examples of LTRIM and RTRIM doing things that are right and wrong.

SELECT  paddedWord, ‘{‘ + LTRIM(RTRIM(paddedWord)) + ‘}’ AS trimmed,
        
‘{‘ + LTRIM(RTRIM(CHAR(0) + paddedWord + CHAR(0))) + ‘}’ AS with_Nuls
FROM    (SELECT CHAR(9) + ‘ leading-tab ‘ AS paddedWord
        
UNION ALL
        
SELECT ‘      trimmable     ‘
        
UNION ALL
        
SELECT CHAR(160) + ‘   un-trimmable    ‘ + CHAR(160)) x
/*
paddedWord            trimmed                 with_Nuls
——————— ———————– ————————-
    leading-tab         {   leading-tab}         {      leading-tab  }
      trimmable       {trimmable}             {       trimmable      }
    un-trimmable      {    un-trimmable     } {     un-trimmable      }

 So, we can see that LTRIM and RTRIM don’t even conform to the marginally useful C tradition of counting space, tab, line feed, and carriage return characters as trimmable whitespace. WhiteSpace should be language-specific and will include many non-word characters. (some languages have a very different ideas as to which of the Unicode characters represent part of a word, and what represents whitespace). One of the first things a grey-muzzle programmer does when creating a database is to create a ‘trim’ function that works reasonably. So what shoud this be? There is quite a problem with TRIM, especially if one wants a function that will work for a variety of languages in Unicode, simply because the

TSQL string functions can’t reliably handle the NCHAR(0) character. However, if we just stick to the European languages and the VARCHARwe’re fairly safe

IF OBJECT_ID(N’Trim’) IS NOT NULL
  
DROP FUNCTION Trim
GO
CREATE FUNCTION Trim
/**
summary:   >
This procedure returns a string with all leading and trailing blank space removed. It is similar to the TRIM functions in most current computer languages. You can change the value of the string assigned to @BlankRange, which is then used by the PATINDEX function. The string can be a rangee.g. a-g or a list of characters such as abcdefg.

Author: Phil Factor
Revision: 1.1 changed list of control character to neater range.
date: 28 Jan 2011
example:
     – code: dbo.Trim(‘  678ABC   ‘)
     – code: dbo.Trim(‘  This has leading and trailing spaces  ‘)
     – code: dbo.Trim(‘  left-Trim This’)
     – code: dbo.Trim(‘Right-Trim This      ‘)
returns:   >
Input string without trailing or leading blank characters, however these characters are defined in @BlankRange

**/ (@String VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
  DECLARE
@BlankRange CHAR(255),
    
@FirstNonBlank INT,
    
@LastNonBlank INT
  IF
@String IS NULL
    
RETURN NULL–filter out null strings
  
SELECT  @BlankRange = CHAR(0) + ‘- ‘ + CHAR(160)
  
/* here is where you set your definition of what constitutes a blank character. We’ve just chosen every ‘control’ character, the space character and the non-breaking space. Your requirements could be different!*/
  
SELECT  @FirstNonBlank = PATINDEX(‘%[^’ + @BlankRange + ‘]%’, @String  collate SQL_Latin1_General_CP850_Bin)
  
SELECT  @lastNonBlank = 1 + LEN(@String + ‘|’) – (PATINDEX(‘%[^’ + @BlankRange + ‘]%’,
                                                            
REVERSE(@Stringcollate SQL_Latin1_General_CP850_Bin))
  
IF @FirstNonBlank > 0
    
RETURN SUBSTRING(@String,@FirstNonBlank, @LastNonBlank@firstNonBlank)
  
RETURN –nothing would be left    
END
GO

Rate

Share

Share

Rate

Related content

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Question: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? This question was sent to me via email. My reply follows. Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? Databases to be mirrored are currently running on 2005 SQL instances but will be upgraded to 2008 SQL in the near future.

Robert Davis

2009-02-23

1,567 reads

Networking – Part 4

You may want to read Part 1 , Part 2 , and Part 3 before continuing. This time around I’d like to talk about social networking. We’ll start with social networking. Facebook, MySpace, and Twitter are all good examples of using technology to let…

Andy Warren

2009-02-17

1,530 reads

Speaking at Community Events – More Thoughts

Last week I posted Speaking at Community Events – Time to Raise the Bar?, a first cut at talking about to what degree we should require experience for speakers at events like SQLSaturday as well as when it might be appropriate to add additional focus/limitations on the presentations that are accepted. I’ve got a few more thoughts on the topic this week, and I look forward to your comments.

Andy Warren

2009-02-13

360 reads