SQLServerCentral Article

Tokenizing a String Using PARSENAME


This article describes an alternative and rather interesting way of doing string tokenizing in T-SQL.

The process of splitting a string to its tokens using a pre-defined separator is a subject discussed in many cases.  The Java and C languages have this facility built in the language definition. And I and others, in previous articles have shown the T-SQL classic implementation of tokenizing a string by "cutting" it to its pieces according to the separator.

I bring here another implementation of tokenizing using the PARSENAME function. We reach the Nth -1 position of the separator in the given string and then we replace the Nth  position of the separator with a '.' character. Then we return the activate the PARSENAME T-SQL function with value = 2.  SQL Server is fooled to believe that this is a valid SQL "object" and returns the desired token string as output.

This is the code implementation for the tokenizing process:

Create function dbo.fn_split_string
(@BigStr varchar(200),
 @seperator char(1),
 @ind int 
) returns varchar(200)
declare @xpos int
declare @i int
declare @res varchar(200)  
set @i=1
set @xpos = CharIndex (@seperator,@BigStr)
while @i < @ind
  set @i = @i + 1
  set @BigStr = substring (@BigStr, @xpos + 1,len(@BigStr) - @xpos )
  set @xpos = CharIndex (@seperator,@BigStr)
if @xpos = 0 
  set @res =  @BigStr
  set @BigStr = substring (@BigStr,1,@xpos - 1) + '.' + 
                substring (@BigStr, @xpos + 1,len(@BigStr) - @xpos )
  set @res = ParseName (@BigStr,2)
return @res

Here’s an example of a call to the function:

select dbo.fn_split_string ( 'eli@leiba@app@dba@in@iec' , '@', 1) as token1,
       dbo.fn_split_string ( 'eli@leiba@app@dba@in@iec' , '@',2) as token2,
       dbo.fn_split_string ( 'eli@leiba@app@dba@in@iec' , '@',3) as token3,
       dbo.fn_split_string ( 'eli@leiba@app@dba@in@iec' , '@',4) as token4,
       dbo.fn_split_string ( 'eli@leiba@app@dba@in@iec' , '@',5) as token5,
       dbo.fn_split_string ( 'eli@leiba@app@dba@in@iec' , '@',6) as token6

We will get the following results:

Token1         Token2                Token3            Token4          Token5        Token6
-------------- -------------------- ------------------ --------------- ------------- ------------ 
eli            leiba                app                dba             in            iec
(1 row(s) affected)


The function I presented here can be used as a general tool for tokenizing strings as shown.

Eli Leiba works at Israel Electric Company as a Senior Application DBA in Oracle and MS SQL Server. He also has certifications from Microsoft and BrainBench in Oracle and SQL Server database administration and implementation. Mr. Leiba holds a B.S. in Computer Science since 1991 and has 13 years' experience working in the databases field. Additionally Mr. Leiba teaches SQL Server DBA and Development courses at Microsoft CTEC and also serves as a senior database consultant for several Israeli start-up companies. (e-mail: iecdba@hotmail.com)


1 (1)




1 (1)