Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««5556575859»»»

Tally OH! An Improved SQL 8K “CSV Splitter” Function Expand / Collapse
Author
Message
Posted Tuesday, November 5, 2013 9:00 PM


Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Tuesday, August 19, 2014 8:18 AM
Points: 562, Visits: 2,617
Thanks Dwain!

-- Alan Burstein



Read this article for best practices on asking questions.
Need to split a string? Try this (Jeff Moden)
Need a pattern-based string spitter? Try this (Dwain Camps)
My blog
Post #1511705
Posted Wednesday, November 6, 2013 9:23 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Sunday, July 27, 2014 3:54 PM
Points: 393, Visits: 811
You can go one step further with Dwain's code and remove the table and take the two selects and combine them into a union all.

/* Anything is possible but is it worth it? */
Post #1511924
Posted Wednesday, November 6, 2013 12:49 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 8:33 AM
Points: 36,958, Visits: 31,465
Just to make sure that everyone knows where I stand on this...

There is absolutely no doubt in my mind that the SQLCLR version, like the one Paul White wrote for me for this article, is the best way to go. It doesn't have the performance problem that occurs when you join a Tally structure to a MAX datatype, it handles > 8000 characters as effeciently as it does <= 8000 characters, and it'll take either VARCHAR or NVARCHAR in stride without having to make any special consideration.

There ARE, however, shops that still won't allow SQLCLR and, if you can operate within the restrictions of the DelimitedSplit8K function, it's the fastest TSQL-only function that you're likely to run across and makes a decent, if not single purposed, second to the SQLCLR function.

To be sure, DelimitedSplit8K and its various permutations weren't designed to be uber flexible. They each were designed (like many intrinsic functions) to do one and only one thing well... split a string based on a single character delimiter using only TSQL.

I also want to thank all of the people that have put time and effort into this community-designed function over the years. As has been suggested in this discussion, I'm doing some additional testing (when I have the time) with adding the binary collation as well as a slightly different take on the cCTE for creating the Tally structure to see if we can get a bit more performance.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1511982
Posted Wednesday, November 6, 2013 5:31 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: 2 days ago @ 4:14 AM
Points: 3,618, Visits: 5,254
Jeff Moden (11/6/2013)

I'm doing some additional testing (when I have the time) with adding the binary collation as well as a slightly different take on the cCTE for creating the Tally structure to see if we can get a bit more performance.


Great to hear! Idle hands are the devil's workshop.

Can't wait to see the next release of one of my favorite toolbox FUNCTIONS.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1512042
Posted Friday, March 7, 2014 1:12 PM


SSChasing Mays

SSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing Mays

Group: General Forum Members
Last Login: Tuesday, August 5, 2014 10:13 AM
Points: 647, Visits: 323
Hey Moden,

Just tried this for the first time. Bravo. Clean and very speedy!

Thanks
John.

Post #1548877
Posted Sunday, April 13, 2014 11:00 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Wednesday, June 11, 2014 9:18 AM
Points: 67, Visits: 51
Thanks for your post. I wonder if you considered using the LEAD function to improve your split. The LEAD function is significantly faster than CHARINDEX. Below is a function that performs much better especially when handling large number of rows. Also, my split handles multi-character delimiters.

CREATE FUNCTION [dbo].[BetterSplit]
(@L NVARCHAR(MAX),@D NVARCHAR(100))
RETURNS TABLE AS RETURN
WITH A(N)AS(SELECT 1UNION ALL SELECT 1UNION ALL SELECT 1UNION ALL SELECT 1UNION ALL SELECT 1)
,B(N)AS(SELECT 1FROM A,A B),C(N)AS(SELECT 1FROM B A,B),D(N)AS(SELECT 1FROM C A,C)
,T(N)AS(SELECT TOP(ISNULL(LEN(@L),0))ROW_NUMBER()OVER(ORDER BY(SELECT 1))FROM D)
,S AS(SELECT S=1UNION ALL SELECT S=N+LEN(@D)FROM T WHERE SUBSTRING(@L,N,LEN(@D))=@D)
,L AS(SELECT S,L=LEAD(S,1,LEN(@L)+LEN(@D)+1)OVER(ORDER BY S)-S-LEN(@D)FROM S)
SELECT Value=LTRIM(RTRIM(SUBSTRING(@L,S,L)))FROM L



Post #1561283
Posted Sunday, April 13, 2014 11:08 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 8:33 AM
Points: 36,958, Visits: 31,465
Miller (4/13/2014)
Thanks for your post. I wonder if you considered using the LEAD function to improve your split. The LEAD function is significantly faster than CHARINDEX. Below is a function that performs much better especially when handling large number of rows. Also, my split handles multi-character delimiters.

CREATE FUNCTION [dbo].[BetterSplit]
(@L NVARCHAR(MAX),@D NVARCHAR(100))
RETURNS TABLE AS RETURN
WITH A(N)AS(SELECT 1UNION ALL SELECT 1UNION ALL SELECT 1UNION ALL SELECT 1UNION ALL SELECT 1)
,B(N)AS(SELECT 1FROM A,A B),C(N)AS(SELECT 1FROM B A,B),D(N)AS(SELECT 1FROM C A,C)
,T(N)AS(SELECT TOP(ISNULL(LEN(@L),0))ROW_NUMBER()OVER(ORDER BY(SELECT 1))FROM D)
,S AS(SELECT S=1UNION ALL SELECT S=N+LEN(@D)FROM T WHERE SUBSTRING(@L,N,LEN(@D))=@D)
,L AS(SELECT S,L=LEAD(S,1,LEN(@L)+LEN(@D)+1)OVER(ORDER BY S)-S-LEN(@D)FROM S)
SELECT Value=LTRIM(RTRIM(SUBSTRING(@L,S,L)))FROM L


Wonderful suggestion and, yes, I have indeed considered it, but someone already beat me to it. Please see the following article.
http://www.sqlservercentral.com/articles/SQL+Server+2012/106903/


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1561284
Posted Sunday, April 13, 2014 11:16 AM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Today @ 7:39 AM
Points: 23,227, Visits: 31,924
Jeff Moden (4/13/2014)
[quote]Miller (4/13/2014)
Thanks for your post. I wonder if you considered using the LEAD function to improve your split. The LEAD function is significantly faster than CHARINDEX. Below is a function that performs much better especially when handling large number of rows. Also, my split handles multi-character delimiters.

CREATE FUNCTION [dbo].[BetterSplit]
(@L NVARCHAR(MAX),@D NVARCHAR(100))
RETURNS TABLE AS RETURN
WITH A(N)AS(SELECT 1UNION ALL SELECT 1UNION ALL SELECT 1UNION ALL SELECT 1UNION ALL SELECT 1)
,B(N)AS(SELECT 1FROM A,A B),C(N)AS(SELECT 1FROM B A,B),D(N)AS(SELECT 1FROM C A,C)
,T(N)AS(SELECT TOP(ISNULL(LEN(@L),0))ROW_NUMBER()OVER(ORDER BY(SELECT 1))FROM D)
,S AS(SELECT S=1UNION ALL SELECT S=N+LEN(@D)FROM T WHERE SUBSTRING(@L,N,LEN(@D))=@D)
,L AS(SELECT S,L=LEAD(S,1,LEN(@L)+LEN(@D)+1)OVER(ORDER BY S)-S-LEN(@D)FROM S)
SELECT Value=LTRIM(RTRIM(SUBSTRING(@L,S,L)))FROM L


Wonderful suggestion and, yes, I have indeed considered it, but someone already beat me to it. Please see the following article.
" target=_"blank" class="SmlLinks">[quote]

Broken or nonexistent link.



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1561285
Posted Sunday, April 13, 2014 11:30 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Wednesday, June 11, 2014 9:18 AM
Points: 67, Visits: 51
Thanks Jeff,
I think you forgot to add the link in your quote.



Post #1561286
Posted Sunday, April 13, 2014 11:32 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 8:33 AM
Points: 36,958, Visits: 31,465
Lynn Pettis (4/13/2014)
Jeff Moden (4/13/2014)
[quote]Miller (4/13/2014)
Thanks for your post. I wonder if you considered using the LEAD function to improve your split. The LEAD function is significantly faster than CHARINDEX. Below is a function that performs much better especially when handling large number of rows. Also, my split handles multi-character delimiters.

CREATE FUNCTION [dbo].[BetterSplit]
(@L NVARCHAR(MAX),@D NVARCHAR(100))
RETURNS TABLE AS RETURN
WITH A(N)AS(SELECT 1UNION ALL SELECT 1UNION ALL SELECT 1UNION ALL SELECT 1UNION ALL SELECT 1)
,B(N)AS(SELECT 1FROM A,A B),C(N)AS(SELECT 1FROM B A,B),D(N)AS(SELECT 1FROM C A,C)
,T(N)AS(SELECT TOP(ISNULL(LEN(@L),0))ROW_NUMBER()OVER(ORDER BY(SELECT 1))FROM D)
,S AS(SELECT S=1UNION ALL SELECT S=N+LEN(@D)FROM T WHERE SUBSTRING(@L,N,LEN(@D))=@D)
,L AS(SELECT S,L=LEAD(S,1,LEN(@L)+LEN(@D)+1)OVER(ORDER BY S)-S-LEN(@D)FROM S)
SELECT Value=LTRIM(RTRIM(SUBSTRING(@L,S,L)))FROM L


Wonderful suggestion and, yes, I have indeed considered it, but someone already beat me to it. Please see the following article.
" target=_"blank" class="SmlLinks">[quote]


Broken or nonexistent link.


Thanks, Lynn. I've repaired the link.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1561287
« Prev Topic | Next Topic »

Add to briefcase «««5556575859»»»

Permissions Expand / Collapse