Split string into table, separator can be more than 1 char

Question

Split string into table, separator can be more than 1 char

halford13

Say Hey Kid

Points: 682
More actions
November 11, 2009 at 2:20 am

#216544

Comments posted to this topic are about the item Split string into table, separator can be more than 1 char

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

phegel SSC Enthusiast Points: 123 More actions · Answer 1

I like your function, because you use a CTE.:-D

I've been learning how and when to use CTE's and this is just another item that has helped me to better understand them.

This funciton has been done before, but I haven't seen it using a CTE.

Thanks!

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 2

Not a bad solution but using a recursive CTE and a multi-statement TVF this function is not very scalable.

Here is another solution to this problem:

CREATE function [dbo].[DelimitedSplit] (

@pString varchar(max),

@pDelimiter char(1)

)

returns table

as

return

with

a1 as (select 1 as N union all

select 1 union all

select 1),

a2 as (select

1 as N

from

a1 as a

cross join a1 as b),

a3 as (select

1 as N

from

a2 as a

cross join a2 as b),

a4 as (select

1 as N

from

a3 as a

cross join a2 as b),

Tally as (select top (len(@pString))

row_number() over (order by N) as N

from

a4),

ItemSplit(

ItemOrder,

Item

) as (

SELECT

N,

SUBSTRING(@pDelimiter + @pString + @pDelimiter,N + 1,CHARINDEX(@pDelimiter,@pDelimiter + @pString + @pDelimiter,N + 1) - N - 1)

FROM

Tally

WHERE

N < LEN(@pDelimiter + @pString + @pDelimiter)

AND SUBSTRING(@pDelimiter + @pString + @pDelimiter,N,1) = @pDelimiter --Notice how we find the delimiter

)

select

row_number() over (order by ItemOrder) as ItemID,

Item

from

ItemSplit

Paul White SSC Guru Points: 150467 More actions · Answer 3

http://sqlblog.com/blogs/adam_machanic/archive/2009/04/28/sqlclr-string-splitting-part-2-even-faster-even-more-scalable.aspx

FYI

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 4

It's a handy function, and a well written article. But, although I much prefer CTEs to subqueries, recursive CTEs are notoriously slow for this type of problem. Use of a tally table (also known as number table) for parsing delimited strings will do the same work in a fraction of the time, and there are even faster techniques in certain situations.

If you are not familiar with tally tables, it is really worth your while to become familiar with them. Jeff Moden's excellent article on the subject can be found here[/url].

For an amazing group discussion about parsing delimited strings, look here.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

halford13 Say Hey Kid Points: 682 More actions · Answer 5

Hi,

You're right, using the Tally table is faster:

using this parameter:

DECLARE @Parameter VARCHAR(8000)

SET @Parameter = REPLICATE('Element01,Element02,Element03,Element04,Element05,',159)

So I got next execution times (running the same script several times):

- my function: 120 ~ 180 miliseconds

- Lynn's function: 90 ~ 140 miliseconds

- Jeff's function (a function created based on Jeff's example, using Tally table (already created on my database) : 90 ~ 140 miliseconds

Thanks!

Thanks.

phegel SSC Enthusiast Points: 123 More actions · Answer 6

This is the function that I am currently using, is it more efficient than the CTE ?

create FUNCTION [dbo].[func_CreateClientTableOrdered]

(

@item_list as varchar(4000)

)

RETURNS @Items Table(OrderNo int, Item varchar(60))

AS

BEGIN

declare @separator as char(1)

,@position as int

,@item as varchar(60)

,@OrderNoas int

set @OrderNo = 0

SET @separator = ','

SET @item_list = @item_list + @separator

SELECT @position = patindex('%'+@separator+'%', @item_list)

WHILE @position <> 0

BEGIN

SELECT @Item = left(@item_list, @position - 1)

SELECT @item_list = stuff(@item_list, 1, @position, '')

SELECT @position= patindex('%'+@separator+'%', @item_list)

INSERT @Items(OrderNo, Item)

SELECT @OrderNo, @Item

SELECT @OrderNo= @OrderNo + 1

END

RETURN

END

Can someone give me a good way of proving which function would perform better?

Or point me to an article on performance tuning functions. Thanks, Paul

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 7

phegel (11/20/2009)
This is the function that I am currently using, is it more efficient than the CTE ?
create FUNCTION [dbo].[func_CreateClientTableOrdered]
(
@item_list as varchar(4000)
)
RETURNS @Items Table(OrderNo int, Item varchar(60))
AS
BEGIN
declare @separator as char(1)
,@position as int
,@item as varchar(60)
,@OrderNoas int
set @OrderNo = 0
SET @separator = ','
SET @item_list = @item_list + @separator
SELECT @position = patindex('%'+@separator+'%', @item_list)
WHILE @position <> 0
BEGIN
SELECT @Item = left(@item_list, @position - 1)
SELECT @item_list = stuff(@item_list, 1, @position, '')
SELECT @position= patindex('%'+@separator+'%', @item_list)
INSERT @Items(OrderNo, Item)
SELECT @OrderNo, @Item
SELECT @OrderNo= @OrderNo + 1
END
RETURN
END
Can someone give me a good way of proving which function would perform better?
Or point me to an article on performance tuning functions. Thanks, Paul

First, look up in this thread, I posted a function that should perform better. Also, there is a link to another thread that contains a very good discussion on string parsing.

I can see two issues with your function. One, it is using a multi-statement TVF. Second, it is using a while loop. Both of these will keep your function from scaling well.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 8

halford13 (11/20/2009)
Hi,
You're right, using the Tally table is faster:
using this parameter:
So I got next execution times (running the same script several times):
- my function: 120 ~ 180 miliseconds
- Lynn's function: 90 ~ 140 miliseconds
- Jeff's function (a function created based on Jeff's example, using Tally table (already created on my database) : 90 ~ 140 miliseconds
Thanks!
Thanks.

I think you'll find that my function uses Jeff's split method, it just incorporates the tally table into the function itself.

halford13 Say Hey Kid Points: 682 More actions · Answer 9

halford13

Say Hey Kid

Points: 682

November 20, 2009 at 11:15 am

#1082376

Hi Lynn,

Yes, I saw you used Jeff's split method.

e-ghetto SSChasing Mays Points: 606 More actions · Answer 10

Erland Sommarskog still knows it best:

http://www.sommarskog.se/arrays-in-sql-2005.html

😛