Blog Post

A Quick TRIM, Sir?

,

The other day, I was answering a question on the ASK.SQLSERVERCENTRAL.COM forum from a programmer who had hit a

problem where he thought that LTRIM and RTRIM had failed to work. Although, I wasn’t entirely certain, I felt sure it

was because of the 'Nul' characters that he admitted had gotten into his VARCHAR column values. The ASCII 'Nul' character,

'control At' (^@) is, in SQL, CHAR(0). It should not to be confused with SQL’s NULL, meaning unknown

LTRIM and RTRIM consider a 'Nul' character to be a ‘non-space’ character, and will quit trimming at that point, whereas it

is represented by the result pane of the Query Window of SSMS as a space. The same is true of any other character that

is represented as spaces between words but isn't a space character, such as the non-breaking space Char(160). This can

cause hours of frustration to a developer trying to track down a bug. LTRIM and RTRIM are pretty-well broken as they

don't work like the TRIM functions of more enlightened languages.

Firstly, here are some examples of LTRIM and RTRIM doing things that are right and wrong.

SELECT  paddedWord, '{' + LTRIM(RTRIM(paddedWord)) + '}' AS trimmed,
        
'{' + LTRIM(RTRIM(CHAR(0) + paddedWord + CHAR(0))) + '}' AS with_Nuls
FROM    (SELECT CHAR(9) + ' leading-tab ' AS paddedWord
        
UNION ALL
        
SELECT '      trimmable     '
        
UNION ALL
        
SELECT CHAR(160) + '   un-trimmable    ' + CHAR(160)) x
/*
paddedWord            trimmed                 with_Nuls
--------------------- ----------------------- -------------------------
    leading-tab         {   leading-tab}         {      leading-tab  }
      trimmable       {trimmable}             {       trimmable      }
    un-trimmable      {    un-trimmable     } {     un-trimmable      }

 So, we can see that LTRIM and RTRIM don't even conform to the marginally useful C tradition of counting space, tab, line feed, and carriage return characters as trimmable whitespace. WhiteSpace should be language-specific and will include many non-word characters. (some languages have a very different ideas as to which of the Unicode characters represent part of a word, and what represents whitespace). One of the first things a grey-muzzle programmer does when creating a database is to create a ‘trim’ function that works reasonably. So what shoud this be? There is quite a problem with TRIM, especially if one wants a function that will work for a variety of languages in Unicode, simply because the

TSQL string functions can't reliably handle the NCHAR(0) character. However, if we just stick to the European languages and the VARCHARwe're fairly safe

IF OBJECT_ID(N'Trim') IS NOT NULL
  
DROP FUNCTION Trim
GO
CREATE FUNCTION Trim
/**
summary:   >
This procedure returns a string with all leading and trailing blank space removed. It is similar to the TRIM functions in most current computer languages. You can change the value of the string assigned to @BlankRange, which is then used by the PATINDEX function. The string can be a rangee.g. a-g or a list of characters such as abcdefg.

Author: Phil Factor
Revision: 1.1 changed list of control character to neater range.
date: 28 Jan 2011
example:
     - code: dbo.Trim('  678ABC   ')
     - code: dbo.Trim('  This has leading and trailing spaces  ')
     - code: dbo.Trim('  left-Trim This')
     - code: dbo.Trim('Right-Trim This      ')
returns:   >
Input string without trailing or leading blank characters, however these characters are defined in @BlankRange

**/ (@String VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
  DECLARE
@BlankRange CHAR(255),
    
@FirstNonBlank INT,
    
@LastNonBlank INT
  IF
@String IS NULL
    
RETURN NULL--filter out null strings
  
SELECT  @BlankRange = CHAR(0) + '- ' + CHAR(160)
  
/* here is where you set your definition of what constitutes a blank character. We've just chosen every 'control' character, the space character and the non-breaking space. Your requirements could be different!*/
  
SELECT  @FirstNonBlank = PATINDEX('%[^' + @BlankRange + ']%', @String  collate SQL_Latin1_General_CP850_Bin)
  
SELECT  @lastNonBlank = 1 + LEN(@String + '|') - (PATINDEX('%[^' + @BlankRange + ']%',
                                                            
REVERSE(@Stringcollate SQL_Latin1_General_CP850_Bin))
  
IF @FirstNonBlank > 0
    
RETURN SUBSTRING(@String,@FirstNonBlank, @LastNonBlank-@firstNonBlank)
  
RETURN '' --nothing would be left    
END
GO

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating