SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


SPLITTING A STRING


SPLITTING A STRING

Author
Message
hemanth T
hemanth T
Valued Member
Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)

Group: General Forum Members
Points: 55 Visits: 122
HI ALL,

I HAVE A SCENARIO LIKE THIS,I HAVE A STRING (GIVEN FROM FRONT END FROM A CSV FILE)-- AS PARAMETER AND I NEED TO SPLIT THE STRING AND STORE THE ROWS INTO RECORD VALUES AND STORE THEM INTO A TABLE ---IN A STORED PROCEDURE
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)

Group: General Forum Members
Points: 84499 Visits: 41062
Heh... first... the use of all caps usually ticks people off because it's an indication of yelling on these forums. Just so you know. ;-)

Second, I'll let the code do the talking. Here's what I use in production especially when the resident DBA won't let me build a Tally Table...

 CREATE FUNCTION dbo.DelimitedSplit8K
/***************************************************************************************************
Purpose:
Split a given string at a given delimiter and return a list of the split elements (items).

Returns:
iTVF containing the following:
ItemNumber = Element position of Item as a BIGINT (not converted to INT to eliminate a CAST)
Item = Element value as a VARCHAR(8000)

CROSS APPLY Usage Example:
---------------------------------------------------------------------------------------------------
--===== Conditionally drop the test tables to make reruns easier for testing.
-- (this is NOT a part of the solution)
IF OBJECT_ID('tempdb..#JBMTest') IS NOT NULL
DROP TABLE #JBMTest
;
--===== Create and populate a test table on the fly (this is NOT a part of the solution).
SELECT *
INTO #JBMTest
FROM (
SELECT 1,'1,10,100,1000,10000,100000,1000000' UNION ALL
SELECT 2,'2000000,200000,20000,2000,200,20,2' UNION ALL
SELECT 3, 'This,is,a,test' UNION ALL
SELECT 4, 'and so is this' UNION ALL
SELECT 5, 'This, too (no pun intended)'
) d (SomeID,SomeValue)
;
GO
--===== Split the CSV column for the whole table using CROSS APPLY (this is the solution)
SELECT test.SomeID, split.ItemNumber, split.Item
FROM #JBMTest test
CROSS APPLY
(
SELECT ItemNumber, Item
FROM dbo.DelimitedSplit8k(test.SomeValue,',')
) split
;
---------------------------------------------------------------------------------------------------
Notes:
1. Optimized for VARCHAR(7999) or less. No testing or error reporting for truncation at 7999
characters is done.
2. Optimized for single character delimiter. Multi-character delimiters should be resolved
externally from this function.
3. Optimized for use with CROSS APPLY.
4. Does not "trim" elements just in case leading or trailing blanks are intended.
5. If you don't know how a Tally table can be used to replace loops, please see the following...
http://www.sqlservercentral.com/articles/T-SQL/62867/
6. Changing this function to use VARCHAR(MAX) will cause it to run twice as slow. It's just the
nature of VARCHAR(MAX) whether it fits in-row or not.
7. Multi-machine testing for the method of using UNPIVOT instead of 10 SELECT/UNION ALLs shows
that the UNPIVOT method is quite machine dependent and can slow things down quite a bit.
8. Performance testing shows using "TOP" for the limiting criteria of "N" is actually
slower and slightly more CPU intensive than the traditional WHERE N < LEN(@pString) + 2.
9. Performance testing shows using ORDER BY (SELECT x) where "x" is anything is actually
slower and slightly more CPU intensive than the traditional ORDER BY (SELECT N).

Credits:
This code is the product of many people's efforts including but not limited to the following:
cteTally concept originally by Iztek Ben Gan and "decimalized" by Lynn Pettis (and others) for a
bit of extra speed and finally redacted by Jeff Moden for a different slant on readability and
compactness. Hat's off to Paul White for his simple explanations of CROSS APPLY. Finally,
special thanks to Erland Sommarskog for his tireless efforts to help people understand
what you can actually do with T-SQL. I also thank whoever wrote the first article I ever saw
on "numbers tables" which is located at the following URL ...
http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-numbers-table.html

Revision History:
Rev 00 - 20 Jan 2010 - Concept: Lynn Pettis and others.
Redaction/Implementation: Jeff Moden
- Base 10 redaction and reduction for CTE. (Total rewrite)

Rev 01 - 13 Mar 2010 - Jeff Moden
- Removed one additional concatenation and one subtraction from the SUBSTRING in the
SELECT List for that tiny bit of extra speed.

Rev 02 - 14 Apr 2010 - Jeff Moden
- No code changes. Added CROSS APPLY usage example to the header, some additional credits,
and extra documentation.

Rev 03 - 18 Apr 2010 - Jeff Moden
- No code changes. Added notes 7, 8, and 9 about certain "optimizations" that dont'
actually work for this type of function.

Rev 04 - 29 Jun 2010 - Jeff Moden
- Added WITH SCHEMABINDING thanks to a note by Paul White. This prevents an unnecessary
"Table Spool" when the function is used in an UPDATE statement even though the function
makes no external references.
***************************************************************************************************/
--===== Define I/O parameters
(
@pString VARCHAR(7999),
@pDelimiter CHAR(1)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
--===== "Inline" CTE Driven "Tally Table” produces values up to
-- 10,000... enough to cover VARCHAR(8000)
WITH
E1(N) AS ( --=== Create Ten 1's
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 --10
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4)
--===== Do the split
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@pString, N, CHARINDEX(@pDelimiter, @pString + @pDelimiter, N) - N) AS Item
FROM cteTally
WHERE N <= DATALENGTH(@pString)+1 --DATATLENGTH allows for trailing space delimiters
AND SUBSTRING(@pDelimiter + @pString, N, 1) = @pDelimiter
;
GO




--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
hemanth T
hemanth T
Valued Member
Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)Valued Member (55 reputation)

Group: General Forum Members
Points: 55 Visits: 122
hi,
thanks for your advice.surely i will not use caps hereafter,and also thank you for the solution u gave to my scenario.:-)
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)

Group: General Forum Members
Points: 84499 Visits: 41062
hemanth T (12/16/2010)
hi,
thanks for your advice.surely i will not use caps hereafter,and also thank you for the solution u gave to my scenario.:-)


Thanks for the feedback, hemanth. And since you're relatively new here, welcome aboard! :-)

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
David Benoit
David Benoit
Hall of Fame
Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)

Group: General Forum Members
Points: 3318 Visits: 3650
Hey Jeff - I certainly like that function better than most that I have seen and used. One question I do have though; have you noticed that it always seems to return an estimated number of rows being 100 when looking at the execution plan? It may be the two examples that I have tested so far but the one I had passed in 158 and in the second < 20. It does far better than other functions in that they always return an estimated rowcount of 1 which really botches the execution plan when you have a large string parsed.

I can start this as another thread but wondered if you have noticed this or something similar.

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)SSC Guru (84K reputation)

Group: General Forum Members
Points: 84499 Visits: 41062
David Benoit (12/16/2010)
Hey Jeff - I certainly like that function better than most that I have seen and used. One question I do have though; have you noticed that it always seems to return an estimated number of rows being 100 when looking at the execution plan? It may be the two examples that I have tested so far but the one I had passed in 158 and in the second < 20. It does far better than other functions in that they always return an estimated rowcount of 1 which really botches the execution plan when you have a large string parsed.

I can start this as another thread but wondered if you have noticed this or something similar.


That IS interesting. I never actually bothered to look because it's never been a performance problem on VARCHAR(8000) for me. I believe it's because E1 has real data so it can figure out E2 will produce 100 rows if not filtered but I won't know until I take a deeper dive on the execution plan.

As a side bar... I also noticed that there's still some nasty implicit conversions in there because of ROW_NUMBER being a BIGINT. I'll have to get to playing with those one of these days.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
David Benoit
David Benoit
Hall of Fame
Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)Hall of Fame (3.3K reputation)

Group: General Forum Members
Points: 3318 Visits: 3650
Yeah, I will have to play with it a bit more as well. Also want to play with a tally table version to see what it does.

One other thing I noticed / ran into (similar to a wall) was that you can't use a hash join hint when joining directly to the function inline. I'm assuming that is due to the fact that it is ordered on ItemNumber and you are joining to Item but haven't played with it to figure that out yet either. I'll let you know when / if I do. Smile

Thanks again for this. Definitely helps!

David

@SQLTentmaker

“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search