Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

SPLITTING A STRING Expand / Collapse
Author
Message
Posted Wednesday, December 15, 2010 11:37 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, August 30, 2012 7:58 AM
Points: 35, Visits: 122
HI ALL,

I HAVE A SCENARIO LIKE THIS,I HAVE A STRING (GIVEN FROM FRONT END FROM A CSV FILE)-- AS PARAMETER AND I NEED TO SPLIT THE STRING AND STORE THE ROWS INTO RECORD VALUES AND STORE THEM INTO A TABLE ---IN A STORED PROCEDURE
Post #1035661
Posted Wednesday, December 15, 2010 11:50 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 8:45 PM
Points: 35,342, Visits: 31,875
Heh... first... the use of all caps usually ticks people off because it's an indication of yelling on these forums. Just so you know.

Second, I'll let the code do the talking. Here's what I use in production especially when the resident DBA won't let me build a Tally Table...

 CREATE FUNCTION dbo.DelimitedSplit8K
/***************************************************************************************************
Purpose:
Split a given string at a given delimiter and return a list of the split elements (items).

Returns:
iTVF containing the following:
ItemNumber = Element position of Item as a BIGINT (not converted to INT to eliminate a CAST)
Item = Element value as a VARCHAR(8000)

CROSS APPLY Usage Example:
---------------------------------------------------------------------------------------------------
--===== Conditionally drop the test tables to make reruns easier for testing.
-- (this is NOT a part of the solution)
IF OBJECT_ID('tempdb..#JBMTest') IS NOT NULL
DROP TABLE #JBMTest
;
--===== Create and populate a test table on the fly (this is NOT a part of the solution).
SELECT *
INTO #JBMTest
FROM (
SELECT 1,'1,10,100,1000,10000,100000,1000000' UNION ALL
SELECT 2,'2000000,200000,20000,2000,200,20,2' UNION ALL
SELECT 3, 'This,is,a,test' UNION ALL
SELECT 4, 'and so is this' UNION ALL
SELECT 5, 'This, too (no pun intended)'
) d (SomeID,SomeValue)
;
GO
--===== Split the CSV column for the whole table using CROSS APPLY (this is the solution)
SELECT test.SomeID, split.ItemNumber, split.Item
FROM #JBMTest test
CROSS APPLY
(
SELECT ItemNumber, Item
FROM dbo.DelimitedSplit8k(test.SomeValue,',')
) split
;
---------------------------------------------------------------------------------------------------
Notes:
1. Optimized for VARCHAR(7999) or less. No testing or error reporting for truncation at 7999
characters is done.
2. Optimized for single character delimiter. Multi-character delimiters should be resolved
externally from this function.
3. Optimized for use with CROSS APPLY.
4. Does not "trim" elements just in case leading or trailing blanks are intended.
5. If you don't know how a Tally table can be used to replace loops, please see the following...
http://www.sqlservercentral.com/articles/T-SQL/62867/
6. Changing this function to use VARCHAR(MAX) will cause it to run twice as slow. It's just the
nature of VARCHAR(MAX) whether it fits in-row or not.
7. Multi-machine testing for the method of using UNPIVOT instead of 10 SELECT/UNION ALLs shows
that the UNPIVOT method is quite machine dependent and can slow things down quite a bit.
8. Performance testing shows using "TOP" for the limiting criteria of "N" is actually
slower and slightly more CPU intensive than the traditional WHERE N < LEN(@pString) + 2.
9. Performance testing shows using ORDER BY (SELECT x) where "x" is anything is actually
slower and slightly more CPU intensive than the traditional ORDER BY (SELECT N).

Credits:
This code is the product of many people's efforts including but not limited to the following:
cteTally concept originally by Iztek Ben Gan and "decimalized" by Lynn Pettis (and others) for a
bit of extra speed and finally redacted by Jeff Moden for a different slant on readability and
compactness. Hat's off to Paul White for his simple explanations of CROSS APPLY. Finally,
special thanks to Erland Sommarskog for his tireless efforts to help people understand
what you can actually do with T-SQL. I also thank whoever wrote the first article I ever saw
on "numbers tables" which is located at the following URL ...
http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-numbers-table.html

Revision History:
Rev 00 - 20 Jan 2010 - Concept: Lynn Pettis and others.
Redaction/Implementation: Jeff Moden
- Base 10 redaction and reduction for CTE. (Total rewrite)

Rev 01 - 13 Mar 2010 - Jeff Moden
- Removed one additional concatenation and one subtraction from the SUBSTRING in the
SELECT List for that tiny bit of extra speed.

Rev 02 - 14 Apr 2010 - Jeff Moden
- No code changes. Added CROSS APPLY usage example to the header, some additional credits,
and extra documentation.

Rev 03 - 18 Apr 2010 - Jeff Moden
- No code changes. Added notes 7, 8, and 9 about certain "optimizations" that dont'
actually work for this type of function.

Rev 04 - 29 Jun 2010 - Jeff Moden
- Added WITH SCHEMABINDING thanks to a note by Paul White. This prevents an unnecessary
"Table Spool" when the function is used in an UPDATE statement even though the function
makes no external references.
***************************************************************************************************/
--===== Define I/O parameters
(
@pString VARCHAR(7999),
@pDelimiter CHAR(1)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
--===== "Inline" CTE Driven "Tally Table” produces values up to
-- 10,000... enough to cover VARCHAR(8000)
WITH
E1(N) AS ( --=== Create Ten 1's
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 --10
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4)
--===== Do the split
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@pString, N, CHARINDEX(@pDelimiter, @pString + @pDelimiter, N) - N) AS Item
FROM cteTally
WHERE N <= DATALENGTH(@pString)+1 --DATATLENGTH allows for trailing space delimiters
AND SUBSTRING(@pDelimiter + @pString, N, 1) = @pDelimiter
;
GO




--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1035664
Posted Thursday, December 16, 2010 3:51 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, August 30, 2012 7:58 AM
Points: 35, Visits: 122
hi,
thanks for your advice.surely i will not use caps hereafter,and also thank you for the solution u gave to my scenario.
Post #1035736
Posted Thursday, December 16, 2010 6:55 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 8:45 PM
Points: 35,342, Visits: 31,875
hemanth T (12/16/2010)
hi,
thanks for your advice.surely i will not use caps hereafter,and also thank you for the solution u gave to my scenario.


Thanks for the feedback, hemanth. And since you're relatively new here, welcome aboard!


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1035801
Posted Thursday, December 16, 2010 7:48 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: 2 days ago @ 12:12 PM
Points: 2,049, Visits: 3,587
Hey Jeff - I certainly like that function better than most that I have seen and used. One question I do have though; have you noticed that it always seems to return an estimated number of rows being 100 when looking at the execution plan? It may be the two examples that I have tested so far but the one I had passed in 158 and in the second < 20. It does far better than other functions in that they always return an estimated rowcount of 1 which really botches the execution plan when you have a large string parsed.

I can start this as another thread but wondered if you have noticed this or something similar.


David

@SQLTentmaker
SQL Tentmaker
“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
Post #1035845
Posted Thursday, December 16, 2010 9:18 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 8:45 PM
Points: 35,342, Visits: 31,875
David Benoit (12/16/2010)
Hey Jeff - I certainly like that function better than most that I have seen and used. One question I do have though; have you noticed that it always seems to return an estimated number of rows being 100 when looking at the execution plan? It may be the two examples that I have tested so far but the one I had passed in 158 and in the second < 20. It does far better than other functions in that they always return an estimated rowcount of 1 which really botches the execution plan when you have a large string parsed.

I can start this as another thread but wondered if you have noticed this or something similar.


That IS interesting. I never actually bothered to look because it's never been a performance problem on VARCHAR(8000) for me. I believe it's because E1 has real data so it can figure out E2 will produce 100 rows if not filtered but I won't know until I take a deeper dive on the execution plan.

As a side bar... I also noticed that there's still some nasty implicit conversions in there because of ROW_NUMBER being a BIGINT. I'll have to get to playing with those one of these days.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1036299
Posted Friday, December 17, 2010 6:14 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: 2 days ago @ 12:12 PM
Points: 2,049, Visits: 3,587
Yeah, I will have to play with it a bit more as well. Also want to play with a tally table version to see what it does.

One other thing I noticed / ran into (similar to a wall) was that you can't use a hash join hint when joining directly to the function inline. I'm assuming that is due to the fact that it is ordered on ItemNumber and you are joining to Item but haven't played with it to figure that out yet either. I'll let you know when / if I do. :)

Thanks again for this. Definitely helps!


David

@SQLTentmaker
SQL Tentmaker
“He is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot
Post #1036483
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse