March 27, 2013 at 2:36 pm
Hi guys,
For some performance considerations, we would like to try alternative to the code below.
Anyone could help on how to get this job done using CTE, temp table or I don not know...
Actually, I am not a dev guy at all ::-)
Thanks,
Mack
CREATE FUNCTION [dbo].[fn_Split](@data NVARCHAR(MAX), @delimiter NVARCHAR(5))
RETURNS @t TABLE (Idx int identity (1,1), data NVARCHAR(max))
AS
BEGIN
DECLARE @textXML XML;
SELECT @textXML = CAST('<d>' + REPLACE(@data, @delimiter, '</d><d>') + '</d>' AS XML);
INSERT INTO @t(data)
SELECT T.split.value('.', 'nvarchar(max)') AS data
FROM @textXML.nodes('/d') T(split)
RETURN
March 27, 2013 at 2:44 pm
Check out the resources with this article: http://www.sqlservercentral.com/articles/Tally+Table/72993/. You should find a CLR splitter that will meet your needs.
Be sure to read the article and the discussion with it.
March 27, 2013 at 3:00 pm
Thanks Lynn,
...I have gone through the article, too much complicated to me and definitely not my stuff...
The thing is I know the function above is killing perf, but I do not know how to write something else despite my readings.
The goal is to submit some results to the dev team and show how crappy then can produce their coding when not being concerned by global performance...
If anyone can "translate" this little function, ideally not in CLR, that would be helpful 🙂
March 27, 2013 at 3:02 pm
Does your function have to split a string declared as nvarchar(max)?
March 27, 2013 at 3:17 pm
Not sure this is absolutely necessary, maybe varchar(some values) could fit, but as of now, I do not know "where" they could use this function to split "what" amout of data...
It could some comments in a text box as it is commercial app...
nvarchar(max) type should be tested at least to compare performance.
March 27, 2013 at 3:28 pm
The two T-SQL functions, DelimitedSplit8K and DelimitedSplitN4K, that are attached to the article I referred you to are optimized to split varchar(8000) and nvarchar(4000) strings.
I would read the article and discussion again, as use these functions on appropriate sized strings to see how they work.
For string longer than varchar(8000) and nvarchar(4000) you will probably want to go to a CLR splitter as it will out perform a T-SQL based splitter.
March 27, 2013 at 4:24 pm
I am trying...but I do not succeed in adapting the 8k function.
Don't know how to "integrate" the <d>, <d/>, etc stuff as they exist in the initial function...
I really suck
March 27, 2013 at 5:57 pm
I have got it...
create function Split_fnOK
(
@datavarchar(8000),
@deli_char varchar(3)
)
returns @list table
(
Idxint,
datavarchar(8000)
)
as
begin
declare @from_locint
declare @to_locint
if charindex(@deli_char,@data,0) <= 0
begin
insert into @list(Idx, data) values (1, @data)
return
end
if charindex(@deli_char,@data,0) > 0
begin
select @from_loc= 0
select @to_loc= charindex(@deli_char,@data,0)
end
if charindex(@deli_char,@data,0) <= 0
begin
select @to_loc = null
end
while @to_loc is not null
begin
if substring(@data,@from_loc, @to_loc - @from_loc) <> ''
begin
insert into @list(Idx, data)
select isnull(max(Idx),0) + 1, substring(@data,@from_loc, @to_loc - @from_loc)
from@list
end
select @from_loc = charindex(@deli_char,@data,@from_loc+len(@deli_char)) + len(@deli_char)
select @to_loc = charindex(@deli_char,@data,@from_loc)
if @to_loc = 0
begin
if substring(@data,@from_loc, (len(@data) - @from_loc) + len(@deli_char)) <> ''
begin
insert into @list(Idx, data)
select isnull(max(Idx),0) + 1, substring(@data,@from_loc, (len(@data) - @from_loc) + len(@deli_char))
from@list
end
select @to_loc = null
end
end
return
end
go
With that set up, I have got much better stats time/cpu and I/O. Good.
Cheers
March 28, 2013 at 8:03 am
MackF (3/27/2013)
I have got it......
With that set up, I have got much better stats time/cpu and I/O. Good.
Cheers
This is going to be a LOT slower than the methods Lynn suggested. Actually orders of magnitude slower. I know you have been pointed to it before but check out the article in my signature about splitting strings. Then compare your looping function to DelimitedSplit8K and see which one is faster. 😛
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
March 28, 2013 at 8:05 pm
MackF (3/27/2013)
I have got it...create function Split_fnOK
(
@datavarchar(8000),
@deli_char varchar(3)
)
returns @list table
(
Idxint,
datavarchar(8000)
)
as
begin
declare @from_locint
declare @to_locint
if charindex(@deli_char,@data,0) <= 0
begin
insert into @list(Idx, data) values (1, @data)
return
end
if charindex(@deli_char,@data,0) > 0
begin
select @from_loc= 0
select @to_loc= charindex(@deli_char,@data,0)
end
if charindex(@deli_char,@data,0) <= 0
begin
select @to_loc = null
end
while @to_loc is not null
begin
if substring(@data,@from_loc, @to_loc - @from_loc) <> ''
begin
insert into @list(Idx, data)
select isnull(max(Idx),0) + 1, substring(@data,@from_loc, @to_loc - @from_loc)
from@list
end
select @from_loc = charindex(@deli_char,@data,@from_loc+len(@deli_char)) + len(@deli_char)
select @to_loc = charindex(@deli_char,@data,@from_loc)
if @to_loc = 0
begin
if substring(@data,@from_loc, (len(@data) - @from_loc) + len(@deli_char)) <> ''
begin
insert into @list(Idx, data)
select isnull(max(Idx),0) + 1, substring(@data,@from_loc, (len(@data) - @from_loc) + len(@deli_char))
from@list
end
select @to_loc = null
end
end
return
end
go
With that set up, I have got much better stats time/cpu and I/O. Good.
Cheers
Show us 2 lines of sample data to split so we can show you how to blow the doors off of everything but a CLR and come pretty close to that, to boot.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 29, 2013 at 4:57 am
I am now testing this one:
CREATE FUNCTION fnSplitString(@str nvarchar(max),@sep nvarchar(max))
RETURNS TABLE
AS
RETURN
WITH a AS(
SELECT CAST(0 AS BIGINT) as idx1,CHARINDEX(@sep,@str) idx2
UNION ALL
SELECT idx2+1,CHARINDEX(@sep,@str,idx2+1)
FROM a
WHERE idx2>0
)
SELECT SUBSTRING(@str,idx1,COALESCE(NULLIF(idx2,0),LEN(@str)+1)-idx1) as value
FROM a
Execution is more simple, and performs same than above with less CPU time consumed for optimizations then...
Ma data to parse could be just like that 'toto.toto,toto.toto,toto.'
Simple 🙂
March 29, 2013 at 7:38 am
MackF (3/29/2013)
I am now testing this one:CREATE FUNCTION fnSplitString(@str nvarchar(max),@sep nvarchar(max))
RETURNS TABLE
AS
RETURN
WITH a AS(
SELECT CAST(0 AS BIGINT) as idx1,CHARINDEX(@sep,@str) idx2
UNION ALL
SELECT idx2+1,CHARINDEX(@sep,@str,idx2+1)
FROM a
WHERE idx2>0
)
SELECT SUBSTRING(@str,idx1,COALESCE(NULLIF(idx2,0),LEN(@str)+1)-idx1) as value
FROM a
Execution is more simple, and performs same than above with less CPU time consumed for optimizations then...
Ma data to parse could be just like that 'toto.toto,toto.toto,toto.'
Simple 🙂
From your original post you stated that you wanted to find an alternative to an XML splitter.
For some performance considerations, we would like to try alternative to the code below.
I don't quite understand why you keep looking at slower alternatives than the one suggested. This is like going to a car dealer and telling them you want to have the fastest car on the lot. The guy show you the Lamborghini, you smile and nod your head and walk over the Pinto. You test drive it and it is in fact faster than the Pacer you are driving currently. The salesman reminds you that for the same price ($0 in t-sql land) that you could drive the MUCH MUCH MUCH faster Aventador but you say you want to keep testing out the Pinto. It just doesn't make sense. Read the article about the tally table splitter, look at the performance tests. There is some code that is almost identical to the one you posted. It was tested in that article and it was found to be magnitudes slower. Stop driving the Pinto and accept the free keys to your new high performance sports car.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply