﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / SQL Server 2005 / T-SQL (SS2K5)  / Convert CSV values in three columns to rows / Latest Posts</title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Tue, 21 May 2013 07:22:58 GMT</lastBuildDate><ttl>20</ttl><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Just thought I'd do a followup on this.  I've managed to fix the problem with the apparent Tally Table slowdown as it approaches 8k.  I've got an article in the works.As a side bar, I sure do appreciate all of the testing that some of you folks went through with me on this thread.  Thank you all again.</description><pubDate>Fri, 01 Apr 2011 18:30:43 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Very cool, Pavel.  Thanks.</description><pubDate>Fri, 08 Oct 2010 06:30:46 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>I've also made a comparison of the CLR methods when using parallel vs forced serial plan. I run it on my Core2 Quad 2.5 GHz. and the CLR beneffits a lot from the parralel plans as shows below test. But even in case of forced serial plan the results are still a lot better than the Tally in my previous tests. Run on the table with 1333 items per line.[code="sql"]--============== PARALLEL PLANS ======GO--CLR fn_SPlitString3DECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT;SELECT    @RowNum = D.RowNum,    @ItemNumber = V.RowID,    @ItemValue = V.ValueFROM dbo.CsvTest3 DCROSS APPLY dbo.fn_SplitString3(D.CsvParameter, ',', 10) VGO--CLR RegExDECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT;SELECT    @RowNum = D.RowNum,    @ItemNumber = V.RowID,    @ItemValue = V.ValueFROM dbo.CsvTest3 DCROSS APPLY dbo.fn_RegExMatches2(D.CsvParameter, '\d+') VGO--============== SERIAL PLANS (FORCED)======GO--CLR fn_SPlitString3DECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT;SELECT    @RowNum = D.RowNum,    @ItemNumber = V.RowID,    @ItemValue = V.ValueFROM dbo.CsvTest3 DCROSS APPLY dbo.fn_SplitString3(D.CsvParameter, ',', 10) VOPTION(MAXDOP 1)GO--CLR RegExDECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT;SELECT    @RowNum = D.RowNum,    @ItemNumber = V.RowID,    @ItemValue = V.ValueFROM dbo.CsvTest3 DCROSS APPLY dbo.fn_RegExMatches2(D.CsvParameter, '\d+') VOPTION(MAXDOP 1)GO[/code]Profiler results:[img]http://www.sqlservercentral.com/Forums/Attachment7322.aspx[/img]Corresponding plans:[img]http://www.sqlservercentral.com/Forums/Attachment7321.aspx[/img]</description><pubDate>Thu, 07 Oct 2010 12:48:05 GMT</pubDate><dc:creator>Pavel Pawlowski</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]Jeff Moden (10/7/2010)[/b][hr]Zounds... not sure what just changed but I used your script exactly and I'm coming up on 10 minutes on my 2k Box.  I've got the Actual Execution Plan enabled so we'll see what happens, shortly.I know it's a stupid question but I have to make sure... are the two CLR methods you ran (I can't test them) producing the correct output?[/quote]Jeff,for sure the two CLR method produce correct results. I run simple test query and compare it against the Tally. [code="sql"]WITH TestData(RowNum, CsvParameter) AS(	SELECT 1 as RowNum, '10,8,156,1389,45726,125456' as CsvParameter UNION ALL	SELECT 2 as RowNum, '4578,13,78784,888,45726' as CsvParameter UNION ALL	SELECT 3 as RowNum, '973,576,12547,88877,2111333' as CsvParameter),Split8KTallyM AS (	SELECT		D.RowNum,		V.ItemNumber,		V.ItemValue AS Split8KTallyM	FROM TestData D	CROSS APPLY dbo.Split8KTallyM(D.CsvParameter, ',') V),CLRSplitString3 AS (	SELECT		D.RowNum,		V.RowID as ItemNumber,		V.Value AS CLRSplitString3	FROM TestData D	CROSS APPLY dbo.fn_SplitString3(D.CsvParameter, ',', 10) V),CLRRegEx AS (	SELECT		D.RowNum,		V.RowID AS ItemNumber,		V.Value AS CLRRegEx	FROM TestData D	CROSS APPLY dbo.fn_RegExMatches2(D.CsvParameter, '\d+') V)SELECT	T.RowNum,	T.ItemNumber,	T.Split8KTallyM,	T1.CLRSplitString3,	T2.CLRRegExFROM Split8KTallyM T	FULL OUTER JOIN CLRSplitString3 T1 ON T.RowNum = T1.RowNum AND T.ItemNumber = T1.ItemNumberFULL OUTER JOIN CLRRegEx T2 ON T.RowNum = T2.RowNum AND T.ItemNumber = T2.ItemNumber[/code]and of course here are results:[code="other"]RowNum      ItemNumber  Split8KTallyM CLRSplitString3 CLRRegEx----------- ----------- ------------- --------------- ----------1           1           10            10              101           2           8             8               81           3           156           156             1561           4           1389          1389            13891           5           45726         45726           457261           6           125456        125456          1254562           1           4578          4578            45782           2           13            13              132           3           78784         78784           787842           4           888           888             8882           5           45726         45726           457263           1           973           973             9733           2           576           576             5763           3           12547         12547           125473           4           88877         88877           888773           5           2111333       2111333         2111333[/code]And also modified version of test run on the big CsvTest table with 1333 or other count of items. This version only returns differences agains the Split8KTallyM. If there is no difference and the functions works correctly then we will receive empty result set.[code="sql"]WITH Split8KTallyM AS (	SELECT		D.RowNum,		V.ItemNumber,		V.ItemValue AS Split8KTallyM	FROM CsvTest D	CROSS APPLY dbo.Split8KTallyM(D.CsvParameter, ',') V),CLRSplitString3 AS (	SELECT		D.RowNum,		V.RowID AS ItemNumber,		V.Value AS CLRSplitString3	FROM CsvTest D	CROSS APPLY dbo.fn_SplitString3(D.CsvParameter, ',', 10) V),CLRRegEx AS (	SELECT		D.RowNum,		V.RowID AS ItemNumber,		V.Value AS CLRRegEx	FROM CsvTest D	CROSS APPLY dbo.fn_RegExMatches2(D.CsvParameter, '\d+') V)SELECT	T.RowNum,	T.ItemNumber,	T.Split8KTallyM,	T1.CLRSplitString3,	T2.CLRRegExFROM Split8KTallyM T	FULL OUTER JOIN CLRSplitString3 T1 ON T.RowNum = T1.RowNum AND T.ItemNumber = T1.ItemNumberFULL OUTER JOIN CLRRegEx T2 ON T.RowNum = T2.RowNum AND T.ItemNumber = T2.ItemNumberWHERE 	T.RowNum IS NULL 	OR 	T1.RowNum IS NULL 	OR 	T2.RowNum IS NULL 	OR 	T.Split8KTallyM &amp;lt;&amp;gt; T1.CLRSplitString3	OR	T.Split8KTallyM &amp;lt;&amp;gt; T2.CLRRegEx[/code]And result:[code="other"]RowNum               ItemNumber  Split8KTallyM CLRSplitString3 CLRRegEx-------------------- ----------- ------------- --------------- ----------(0 row(s) affected)[/code]So I can confirm, that both the RegEx and also the fastest function processing the string on character level works correctly and returns correct results.</description><pubDate>Thu, 07 Oct 2010 12:28:27 GMT</pubDate><dc:creator>Pavel Pawlowski</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Ok... here's what I got (using Pavel's code) on my 2005 box which only has 1 CPU (P4 @1.8GHz), 1GB Ram, and an 8 year old hard drive.  Notice that collation does make a huge change.  The place where I changed the collation was within the function itself.My default collation is the SQL Server Default of SQL_Latin1_General_CP1_CI_AS which explains why stuff runs faster on my machine than on a lot of other folks machines. [code="plain"]                   SPID              Dur    CPU     Reads   Writes RowCountSQL:BatchCompleted	55	--Tally Test 862545 737422 14813936 29871  26660000 Latin1_General_CI_ASSQL:BatchCompleted	55	--Tally Test 491414 457594 14819820 29884  26660001 SQL_Latin1_General_CP1_CI_ASSQL:BatchCompleted	55	--Tally Test 470092 455282 14864872 30166  26660000 Latin1_General_BIN[/code]Of course, it's still not going to touch the SQLCLR splitter.  We knew that going into this.</description><pubDate>Thu, 07 Oct 2010 06:36:45 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Zounds... not sure what just changed but I used your script exactly and I'm coming up on 10 minutes on my 2k Box.  I've got the Actual Execution Plan enabled so we'll see what happens, shortly.I know it's a stupid question but I have to make sure... are the two CLR methods you ran (I can't test them) producing the correct output?</description><pubDate>Thu, 07 Oct 2010 05:41:56 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]Jeff Moden (10/6/2010)[/b][hr]Thanks, Pavel.  I'll give it a whirl on my 2k5 box.  Are you using 2k5 or 2k8?[/quote]I'm running 2008</description><pubDate>Thu, 07 Oct 2010 00:41:30 GMT</pubDate><dc:creator>Pavel Pawlowski</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Thanks, Pavel.  I'll give it a whirl on my 2k5 box.  Are you using 2k5 or 2k8?</description><pubDate>Wed, 06 Oct 2010 16:01:48 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]Jeff Moden (10/5/2010)[/b][hr]If you want (don't worry about the CLR side... we know that's good), post all of the code YOU are using to setup the test, the function, and the Tally Table and I'll try that on my machine to see if I can duplicate the problem you're having.  I have both a 2k5 and 2k8 machine at home to test on now.And, yes... I realize that much of the code may be what's already been posted and much of it may be mine.  Since I'm trying to duplicate your problem, I wanted you to post the actual code you used so I can try and find a difference.Thanks, Pavel.[/quote]Jeff,here is a complete script and also profiler output:[code="sql"]--Create TestDBCREATE DATABASE [TestDB]	COLLATE Latin1_General_CI_ASGO--Use TestDBUSE [TestDB]GO--Create and populate tally tableSELECT TOP 11000	IDENTITY(INT, 1, 1) AS NINTO dbo.TallyFROM sys.all_objects o1, sys.all_objectsGO--Add Clustered Index on Tally tableALTER TABLE dbo.Tally 	ADD CONSTRAINT PK_Tally PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100GO--Create and populate CsvTest table (doesn't matter whether the table has Clustered index or it is simply heap)SELECT TOP (10000) --Controls the number of rows in the test table            ISNULL(ROW_NUMBER() OVER (ORDER BY(SELECT NULL)),0) AS RowNum,    (        SELECT CAST(STUFF( --=== STUFF get's rid of the leading comma                                ( --=== This builds CSV row with a leading comma                                 SELECT TOP (1333) --Controls the number of CSV elements in each row                                            ','+CAST(ABS(CHECKSUM(NEWID()))%100000 AS VARCHAR(10))                FROM dbo.Tally t3      --Classic cross join pseudo-cursor                                  CROSS JOIN dbo.Tally t4 --can produce row sets up 121 million.                                 WHERE t1.N &amp;lt;&amp;gt; t3.N --Without this line, all rows would be the same                                    FOR XML PATH('')                )                ,1,1,'') AS VARCHAR(8000))                ) AS CsvParameterINTO CsvTestFROM dbo.Tally t1        --Classic cross join pseudo-cursor  CROSS JOIN dbo.Tally t2;  --can produce row sets up 121 million.GO--Create Split Tally FunctionCREATE FUNCTION dbo.Split8KTallyM (	@Parameter VARCHAR(8000), 	@Delimiter VARCHAR(1))RETURNS @Result TABLE (ItemNumber INT, ItemValue INT) AS  BEGIN INSERT INTO @Result        (ItemNumber, ItemValue) SELECT CAST(ROW_NUMBER() OVER (ORDER BY N) AS INT) AS ItemNumber,        SUBSTRING(@Parameter,N,CHARINDEX(@Delimiter,@Parameter+@Delimiter,N)-N) AS ItemValue   FROM dbo.Tally  WHERE N BETWEEN 1 AND LEN(@Parameter)+1    AND SUBSTRING(@Delimiter+@Parameter,N,1) = @Delimiter; --Notice how we find the comma RETURN    END;GO--Tally TestDECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT;SELECT    @RowNum = D.RowNum,    @ItemNumber = V.ItemNumber,    @ItemValue = V.ItemValueFROM dbo.CsvTest DCROSS APPLY dbo.Split8KTallyM(D.CsvParameter, ',') VGO[/code][img]http://www.sqlservercentral.com/Forums/Attachment7275.aspx[/img]</description><pubDate>Tue, 05 Oct 2010 13:21:08 GMT</pubDate><dc:creator>Pavel Pawlowski</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>I'm not sure why but the Tally Table method appears (from this angle... can't tell really) to be Cross Joining with the data which, of course, is going to make it terribly slow.  The question is... why?  Again... can't tell from here and it's not happening that way with any of the (apparently) same tests I've run on 2005 or 2008.  At this point, I'm not sure what the two of us are doing differently.I've also lost track a bit of what we've done.  The difference may be in the Tally Table itself (post the CREATE TABLE statement and all indexes you have on it just so I can have a look, please?).  This type of stuff is difficult to TS remotely.If you want (don't worry about the CLR side... we know that's good), post all of the code YOU are using to setup the test, the function, and the Tally Table and I'll try that on my machine to see if I can duplicate the problem you're having.  I have both a 2k5 and 2k8 machine at home to test on now.And, yes... I realize that much of the code may be what's already been posted and much of it may be mine.  Since I'm trying to duplicate your problem, I wanted you to post the actual code you used so I can try and find a difference.Thanks, Pavel.</description><pubDate>Tue, 05 Oct 2010 04:55:21 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]Jeff Moden (10/3/2010)[/b][hr]@Pavel,What is the default collation on your machine, please?  And thanks ahead of time for helping me figure this out.[/quote]Jeff,my default collation is Czech_CI_AS, but I've tested this also on DB with Latin1_General_CI_AS collation and the results are the same. Tested it also on Clustered table and also on heap.Maybe the difference is in the query Plan. Don't know your query plan, but on my machine SQL Server chooses paralel plan for the RegEx solution and maybe it will be benefit for it.Here are the Plans. The first paralel one is for the RegEx CLR.[img]http://www.sqlservercentral.com/Forums/Attachment7267.aspx[/img]Also in term of optimizations and speed of CSV Splitting I've created a little small CLR function which is nearly 5 times quicker than the RegEx for CSV splitting and more than 30 times faster than the tally on my box. It can be used for simple CSV with one char delimiter. (The maxLen argument is length of buffer (maximum length of item i the CSV)Here is the source code  and below the results from Profilers.[code="other"]public class StringSplit{    private struct StrRow    {        public StrRow(int rowId, SqlChars value)        {            RowId = rowId;            Value = value;        }        public int RowId;        public SqlChars Value;    }    [SqlFunction(FillRowMethodName = "FillSplitString3")]    public static IEnumerable SplitString3(SqlString sourceString, string delimiter, int maxLen)    {        char[] buffer = new char[maxLen];        char delim = delimiter[0];        int rowNumber = 0;        int chars = 0;        char[] finalString;        foreach (char chr in sourceString.Value)        {            if (chr == delim)            {                finalString = new char[chars];                Array.Copy(buffer, finalString, chars);                yield return new StrRow(++rowNumber, new SqlChars(finalString));                chars = 0;            }            else            {                buffer[chars++] = chr;            }        }        if (chars &amp;gt; 0)        {            finalString = new char[chars];            Array.Copy(buffer, finalString, chars);            yield return new StrRow(++rowNumber, new SqlChars(finalString));        }    }    public static void FillSplitString3(object obj, out int rowId, out SqlChars value)    {        StrRow r = (StrRow)obj;        rowId = r.RowId;        value = r.Value;    }}[/code][code="sql"]CREATE FUNCTION dbo.fn_SplitString3(	@sourceString nvarchar(max),	@delimiter nchar(1),	@maxLen int)returns TABLE (	RowID int,	Value nvarchar(10))ASEXTERNAL NAME [CQLCLR].StringSplit.SplitString3GO[/code]Profiler Output:[img]http://www.sqlservercentral.com/Forums/Attachment7268.aspx[/img]</description><pubDate>Mon, 04 Oct 2010 14:04:49 GMT</pubDate><dc:creator>Pavel Pawlowski</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>@Pavel,What is the default collation on your machine, please?  And thanks ahead of time for helping me figure this out.</description><pubDate>Sun, 03 Oct 2010 12:32:15 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]Pavel Pawlowski (10/2/2010)[/b][hr]But even the Split8KTallyM is 8 times slower comparing the CLR RegEx on the hi element counts (1333).Maybe I'm doing somethig wrong or forget something, but I'm not aware of this.[/quote]You're correct.  I'm not sure what or if you're doing anything wrong because the tally table versions I used obviously took a lot less than yours did.  I'll take a look at your code and see if I can figure it out.</description><pubDate>Sun, 03 Oct 2010 12:27:06 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]steve-893342 (10/2/2010)[/b][hr]Aha the 1,333 test:-) Curiously enough I had already carried out the analysis according to your CsvTest but on 10,000 Row X 1,333 Element table. Results available on request[/quote]Heh... I'm not quite sure why you make people request the results.  They'd be useful to the forum even if I had absolutely no interest.  ;-)</description><pubDate>Sun, 03 Oct 2010 12:22:18 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]mishaluba (6/27/2010)[/b][hr]Hello,I have a table with three columns containing CSV values.  I need to normalize them.  The order of the CSV values is meaningful (explained more below)  Here is the sample DDL:[code="other"]CREATE TABLE #t (id int, kpi1 varchar(16), kpi2 varchar(16), kpi3 varchar(16))insert into #t (id, kpi1, kpi2, kpi3)values(1, '0.1,0.11,0.111', '0.2,0.22,0.222', '0.3,0.33,0.333')insert into #t (id, kpi1, kpi2, kpi3)values(2, '0.4,0.44,0.444', '0.5,0.55,0.555', '0.6,0.66,0.666')select *from #t[/code]My desired result would look something like this:id          kpi1                  kpi2                 kpi3                 items1     items2	        items3----------- ---------------- ---------------- ---------------- ---------- ------	-----1           0.1,0.11,0.111   0.2,0.22,0.222   0.3,0.33,0.333   0.1         0.2		0.31           0.1,0.11,0.111   0.2,0.22,0.222   0.3,0.33,0.333   0.11       0.22		0.331           0.1,0.11,0.111   0.2,0.22,0.222   0.3,0.33,0.333   0.111      0.222		0.3332           0.4,0.44,0.444   0.5,0.55,0.555   0.6,0.66,0.666   0.4         0.5		0.62           0.4,0.44,0.444   0.5,0.55,0.555   0.6,0.66,0.666   0.44       0.55		0.662           0.4,0.44,0.444   0.5,0.55,0.555   0.6,0.66,0.666   0.444      0.555		0.666I hope this is clear.  Basically I want to split CSV values in each of the columns without multiplying the number of records every time I do the split, but just to append them to the same id.  Once again, the order of CSV values is meaningful for the subsequent manipulations.  For example: 0.1, 0.2, 0.3 (first values from kpi1, kpi2, kpi3) form a meaningful set of values, that's why I want them to be in the first row.  The next row contains second values from each of the kpi's: 0.11, 0.22, 0.33.  Since I do have an id in my source table and the function returns this id, I thought I could do a JOIN instead of CROSS APPLY, but this gives me a syntax error:[/quote]OK then folks, returning to the original post:-) I have another solution.This approach uses CROSS APPLY and some string manipulation to accomplish the splitting[code="sql"]SELECT id, kpi1, kpi2, kpi3, items1, items2, items3FROM #tCROSS APPLY(SELECT LEFT(kpi1, CHARINDEX(',', kpi1) - 1), LEFT(kpi2, CHARINDEX(',', kpi2) - 1), LEFT(kpi3, CHARINDEX(',', kpi3) - 1)UNION ALL SELECT SUBSTRING(kpi1, CHARINDEX(',', kpi1) + 1, CHARINDEX(',', kpi1, CHARINDEX(',', kpi1) + 1) - CHARINDEX(',', kpi1) - 1), SUBSTRING(kpi2, CHARINDEX(',', kpi2) + 1, CHARINDEX(',', kpi2, CHARINDEX(',', kpi2) + 1) - CHARINDEX(',', kpi2) - 1), SUBSTRING(kpi3, CHARINDEX(',', kpi3) + 1, CHARINDEX(',', kpi3, CHARINDEX(',', kpi3) + 1) - CHARINDEX(',', kpi3) - 1)UNION ALL SELECT RIGHT(kpi1, CHARINDEX(',', REVERSE(kpi1)) - 1), RIGHT(kpi2, CHARINDEX(',', REVERSE(kpi2)) - 1), RIGHT(kpi3, CHARINDEX(',', REVERSE(kpi3)) - 1)) AS Z (items1, items2, items3)[/code]</description><pubDate>Sun, 03 Oct 2010 09:13:42 GMT</pubDate><dc:creator>steve-893342</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Aha the 1,333 test:-) Curiously enough I had already carried out the analysis according to your CsvTest but on 10,000 Row X 1,333 Element table. Results available on request</description><pubDate>Sat, 02 Oct 2010 14:29:02 GMT</pubDate><dc:creator>steve-893342</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Jeff, you are right about the data types, so I've modified the nvarchar to varchar in the test data generator and make a comparison of the Split8KTallyM vs. CLR RegEx. Aso I've made small enhancement to the CLR RegEx by changing input and output datatypes to match closer the input and output data and simplified the Regular expression for the concrete test.As you said, it's alwasy "It Depends".But even the Split8KTallyM is 8 times slower comparing the CLR RegEx on the hi element counts (1333).Maybe I'm doing somethig wrong or forget something, but I'm not aware of this.Here are results from test on my workstation. (both runs against the test data with varchar column instead of nvarchar)[img]http://www.sqlservercentral.com/Forums/Attachment7256.aspx[/img]Here is definition CLR RegEx function  (rest is the same as in previous post)[code="other"]    [SqlFunction(FillRowMethodName = "FillRegExRow")]    public static IEnumerable RegExMatches2(SqlString sourceString, SqlString pattern)    {        Regex r = new Regex(pattern.Value, RegexOptions.Compiled);        int rowId = 0;        int matchId = 0;        foreach (Match m in r.Matches(sourceString.Value))        {            matchId++;            for (int i = 0; i &amp;lt; m.Groups.Count; i++)            {                yield return new RegExRow(++rowId, matchId, i, m.Groups[i].Value);            }        }    }[/code][code="sql"]CREATE FUNCTION [dbo].[fn_RegExMatches2](@sourceString [nvarchar](max), @pattern [nvarchar](10))RETURNS  TABLE (	[rowId] [int] NULL,	[matchId] [int] NULL,	[groupId] [int] NULL,	[value] [nvarchar](10) NULL) WITH EXECUTE AS CALLERAS EXTERNAL NAME [SQLRegEx].[SQLRegEx].[RegExMatches2]GO[/code]And TestQuery[code="sql"]--===  CLR-RegEx Version  ===DECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT;SELECT    @RowNum = D.RowNum,    @ItemNumber = V.matchId,    @ItemValue = CAST(V.value as int)FROM dbo.CsvTest4 DCROSS APPLY dbo.fn_RegExMatches2(D.CsvParameter, '\d+') VGO--===  Split8KTallyM  ===DECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT;SELECT    @RowNum = D.RowNum,    @ItemNumber = V.ItemNumber,    @ItemValue = V.ItemValueFROM dbo.CsvTest4 DCROSS APPLY dbo.Split8KTallyM(D.CsvParameter, ',') VGO[/code]</description><pubDate>Sat, 02 Oct 2010 13:59:34 GMT</pubDate><dc:creator>Pavel Pawlowski</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]Pavel Pawlowski (9/26/2010)[/b][hr]But the situation differs with increased number of element. Tested it on 1333 elements (nearly 8k). The Tally table version I stopped after 19 minutes of execution time.[/quote]Do you see the two errors we've both made that caused the Tally Table to look bad?  ;-)The Tally Table works by joining to the data at the character level.  So the first performance error occurs because of what happens if you join two columns that are different datatypes... The data is NVARCHAR and the Tally Table splitter is based on VARCHAR.The second "error" is caused by the myth than an iTVF always works as well at high row-counts as it does at low row-counts.  Because of the join at the character level, you actually end up with a much larger cross join between the Tally Table and the data than required for the larger column counts because the optimizer can "see" the splitter code instead of not being able to see it as if it were a scalar function.  In other words, the optimizer makes a bad mistake here.  There are some fixes you can add but let’s not do that for the moment.  I will, however, include mlTVF code to do a Tally Table split below.So... if we do like we would do when tuning any query for performance, we need to make the datatypes match for the join and we need to change to a NON iTVF to get the performance the Tally Table is famous for even at larger numbers of CSV elements.In my tests, the only change I made to the data generator was to change between creating an NVARCHAR column and a VARCHAR column and then ran the tests for 10, 100, and 1333 for each datatype.  I also threw in the NON iTVF function to show that the problem really isn't with the Tally Table... the problem is with what the optimizer chose to do with it all and we had to do a little tuning just like we would with any query...Here are the results from the tests.  As you can see, "It Depends" and "Tuning Prevails".  It clearly demonstrates that the iTVF is very good (in this case) for very low column counts but loses it's mind even to XML splitters at higher column counts especially when the datatype doesn’t match.  It also clearly demonstrates that the mlTVF for the Tally table is slow at very low row counts and blows the doors off both the iTVF and the XML method for the larger row count that you did your testing with.So, it’s not the Tally Table split that’s the problem… it’s how it’s used that’s the problem and, just like any other code in SQL, there are some best practices to follow including the matching of data-types in joins.[img]http://www.sqlservercentral.com/Forums/Attachment7255.aspx[/img]Oh yeah... almost forgot... here's the mlTVF I used...[code="sql"]--===== Tally Table (Split8KTallyM mlTVF) ===============================================================================   DROP FUNCTION dbo.Split8KTallyM; GO CREATE FUNCTION dbo.Split8KTallyM         (@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))RETURNS @Result TABLE (ItemNumber INT, ItemValue INT) AS  BEGIN INSERT INTO @Result        (ItemNumber, ItemValue) SELECT CAST(ROW_NUMBER() OVER (ORDER BY N) AS INT) AS ItemNumber,        SUBSTRING(@Parameter,N,CHARINDEX(@Delimiter,@Parameter+@Delimiter,N)-N) AS ItemValue   FROM dbo.Tally  WHERE N BETWEEN 1 AND LEN(@Parameter)+1    AND SUBSTRING(@Delimiter+@Parameter,N,1) = @Delimiter; --Notice how we find the comma RETURN    END;GO[/code]And, no... I didn't take the time to test an NVARCHAR version of the Tally Table code here.  I'm saving that for the article. :-D </description><pubDate>Sat, 02 Oct 2010 11:37:11 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote]But as Oleg said for larger element count the Tally is not usable.For the 16 element on my machine the Tally runs about 1 se. and the CLR RegEx about 20 sec.For 100 elements the Tally runs 20550 ms and CLR RegEx 26773 ms.But the situation differs with increased number of element. Tested it on 1333 elements (nearly 8k). The Tally table version I stopped after 19 minutes of execution time.[/quote]Thanks for the feedback... I'll run your code and see if I come up with the same results.</description><pubDate>Sat, 02 Oct 2010 00:36:23 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Jeff,I've done some testing about this also on my machine, but instead of the XML version I've tested the CLR RegEx solution.Here is the source for the RegEx CLR Function[code="other"]public class SQLRegEx{    private class RegExRow    {        /// &amp;lt;summary&amp;gt;        /// Private class for passing matches of the RegExMatches to the FillRow method        /// &amp;lt;/summary&amp;gt;        /// &amp;lt;param name="rowId"&amp;gt;ID of the Row&amp;lt;/param&amp;gt;        /// &amp;lt;param name="matchId"&amp;gt;ID of the Match&amp;lt;/param&amp;gt;        /// &amp;lt;param name="groupID"&amp;gt;ID of the Group within the Match&amp;lt;/param&amp;gt;        /// &amp;lt;param name="value"&amp;gt;Value of the particular group&amp;lt;/param&amp;gt;        public RegExRow(int rowId, int matchId, int groupID, string value)        {            RowId = rowId;            MatchId = matchId;            GroupID = groupID;            Value = value;        }        public int RowId;        public int MatchId;        public int GroupID;        public string Value;    }    /// &amp;lt;summary&amp;gt;    /// Applies Regular Expression to the Source strings and return all matches and groups    /// &amp;lt;/summary&amp;gt;    /// &amp;lt;param name="sourceString"&amp;gt;Source string on which the regular expression should be applied&amp;lt;/param&amp;gt;    /// &amp;lt;param name="pattern"&amp;gt;Regular Expression pattern&amp;lt;/param&amp;gt;    /// &amp;lt;returns&amp;gt;Returns list of RegExRows representing the group value&amp;lt;/returns&amp;gt;    [SqlFunction(FillRowMethodName = "FillRegExRow")]    public static IEnumerable RegExMatches(string sourceString, string pattern)    {        Regex r = new Regex(pattern, RegexOptions.Compiled);        int rowId = 0;        int matchId = 0;        foreach (Match m in r.Matches(sourceString))        {            matchId++;            for (int i = 0; i &amp;lt; m.Groups.Count; i++)            {                yield return new RegExRow(++rowId, matchId, i, m.Groups[i].Value);            }        }    }    /// &amp;lt;summary&amp;gt;    /// FillRow method to populate the output table    /// &amp;lt;/summary&amp;gt;    /// &amp;lt;param name="obj"&amp;gt;RegExRow passed as object&amp;lt;/param&amp;gt;    /// &amp;lt;param name="rowId"&amp;gt;ID or the returned row&amp;lt;/param&amp;gt;    /// &amp;lt;param name="matchId"&amp;gt;ID of returned Match&amp;lt;/param&amp;gt;    /// &amp;lt;param name="groupID"&amp;gt;ID of group in the Match&amp;lt;/param&amp;gt;    /// &amp;lt;param name="value"&amp;gt;Value of the Group&amp;lt;/param&amp;gt;    public static void FillRegExRow(Object obj, out int rowId, out int matchId, out int groupID, out SqlChars value)    {        RegExRow r = (RegExRow)obj;        rowId = r.RowId;        matchId = r.MatchId;        groupID = r.GroupID;        value = new SqlChars(r.Value);    }}[/code][code="sql"]CREATE ASSEMBLY [SQLRegEx]AUTHORIZATION [dbo]FROM 'C:\CLR\SQLRegEx.dll'WITH PERMISSION_SET = SAFEGOCREATE FUNCTION [dbo].[fn_RegExMatches](	@sourceString nvarchar(max), --Source string to be processed by regular expression	@pattern nvarchar(4000)) --Regular expression (pattern) to be applied on the source stringRETURNS  TABLE (    [rowId] int, --RowId each row as it's ID    [matchId] int, --ID of particular match (starts from 1)        [groupId] int, --ID of particular group in RegEx match (GroupID = 0) represents a complete match        [value] [nvarchar](4000) --value of the group) WITH EXECUTE AS CALLERAS EXTERNAL NAME [SQLRegEx].[SQLRegEx].[RegExMatches]GO[/code]I've tested it on the test tables you have provided and it's true, that [b]for small number of elements the Tally table solution is unbeatable[/b]. But as Oleg said for larger element count the Tally is not usable.For the 16 element on my machine the Tally runs about 1 se. and the CLR RegEx about 20 sec.For 100 elements the Tally runs 20550 ms and CLR RegEx 26773 ms.But the situation differs with increased number of element. Tested it on 1333 elements (nearly 8k). [b]The Tally table version I stopped after 19 minutes [/b]of execution time.Insetad the CLR RegEx solution tooks 46 seconds to finish.[b]Test query for the CRL RegEx:[/b][code="sql"]DBCC DROPCLEANBUFFERSGODBCC FREEPROCCACHEGOSET STATISTICS IO ONSET STATISTICS TIME ONGODECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT;SELECT	@RowNum = D.RowNum,	@ItemNumber = V.matchId,	@ItemValue = Cast(V.value AS int)FROM dbo.CsvTest2 DCROSS APPLY dbo.fn_RegExMatches(D.CsvParameter, '(\d+?)(?:,|$)') VWHERE V.groupId = 1GOSET STATISTICS TIME OFFSET STATISTICS IO OFFGO[/code][b]Here are results for the CLR RegEx:[/b]SQL Server parse and compile time:    CPU time = 0 ms, elapsed time = 33 ms.Table 'CsvTest2'. Scan count 5, logical reads 217, physical reads 5, read-ahead reads 68, lob logical reads 320872, lob physical reads 2500, lob read-ahead reads 170000.Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times:   CPU time = 140775 ms,  elapsed time = 46082 ms.SQL Server parse and compile time:    CPU time = 0 ms, elapsed time = 0 ms.Profiler output:[img]http://lvp5xa.bay.livefilestore.com/y1plSgWF5uIgNehKSP6W_43txO8w3TY0PkzvtkfmtVsGemVlQplAXU2QoCSA64bBtWD17eFW9nEqrH2nYxaTMqm1D-uEGWLChic/CLRSplit_Profiler.png?psid=1[/img]Run on my working machine Core 2 Quand 2.5 GHz * GB RAMSo from my tests the Tally solution is unusable for large arrays splitting. Instead the CLR works perfectly even with very lage arrays.. When tested it on single array of 100 000 elements, the execution time is 745 ms and for array with 1 000 000 elements it's 7.8 sec.Also using the RegEx, there are no AppDomains reloads/restarts as Phill Factor somewhere mentioned when using the String.Split method for splitting arrays. The RegEx doesn't have this kind of problems.</description><pubDate>Sun, 26 Sep 2010 06:21:56 GMT</pubDate><dc:creator>Pavel Pawlowski</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]Oleg Netchaev (9/18/2010)[/b][hr]Jeff,So far, I can see that I use the same setup as far as the tally creation is concerned. The collation is of course the same as well. You are correct in not using xml3, it is designed to fail due to the cross apply inside of it. This is what prompted me to modify my answer on ask after I saw Kevan's. My original answer included the same dreaded cross apply, but once I removed it, it became faster than Kevan's tally for sizeable strings. I still deleted the answer from ask altogether cause I respect your opinion (though cannot agree with it yet). I will try to play with settings and also check whether it is possible to tweak the xml a bit. One thing I saw for sure, I and hope that you will agree that with huge strings, xml execution times does not increase much, but tally split sure does. I think that the most relevant test we can have will be with the string size somewhere in the middle, say 4k per record meaning about 400 items to split. More than 800 makes the string greater than 8k in size. I would like to point out that while you do mention apples to apples comparison, the xml function you wrote does use the varchar(max). I will leave it as is though, because xml-ifying the input bloats it somewhat, so let varchar(max) stay in xml function.Oleg[/quote]Hi Oleg,I appreciate the respect and I do wish you'd put your good post back up on Ask.I definitely agree that the Tally table stumbles pretty badly on things above 8k and I'm setting up to test that.  On the apples-to-apples thing, I left VARCHAR(MAX) in the XML code because in order to get anything close to 8k of data, you have to use VARCHAR(MAX) to account for the addition of the 7 character tag information at each delimiter.  Just to be complete, though, I'll try it with just a VARCHAR(8000) at the same 100 elements I've been testing with, so far.I've got some folks participating in the 100 element test right now.  If it works out ok, I'll try to figure out a way to do a bit more automated and extensive testing across a wider range.  I didn't think the Statistics reporting was going to work out as well as it is currently working out and didn't want to bog peoples machines down so much.You WILL be please to know that it's YOUR method in XML-1 in that testing and even though it's a mlTVF, it blows the other XML methods away for performance.  Under 8k, the Tally table still beats up on it pretty good.  Like I said, I'm setting up for &amp;gt;8k testing... it just takes some time to do it all so people don't have to mess with things when they test it.</description><pubDate>Sun, 19 Sep 2010 12:57:23 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Jeff,So far, I can see that I use the same setup as far as the tally creation is concerned. The collation is of course the same as well. You are correct in not using xml3, it is designed to fail due to the cross apply inside of it. This is what prompted me to modify my answer on ask after I saw Kevan's. My original answer included the same dreaded cross apply, but once I removed it, it became faster than Kevan's tally for sizeable strings. I still deleted the answer from ask altogether cause I respect your opinion (though cannot agree with it yet). I will try to play with settings and also check whether it is possible to tweak the xml a bit. One thing I saw for sure, I and hope that you will agree that with huge strings, xml execution times does not increase much, but tally split sure does. I think that the most relevant test we can have will be with the string size somewhere in the middle, say 4k per record meaning about 400 items to split. More than 800 makes the string greater than 8k in size. I would like to point out that while you do mention apples to apples comparison, the xml function you wrote does use the varchar(max). I will leave it as is though, because xml-ifying the input bloats it somewhat, so let varchar(max) stay in xml function.Oleg</description><pubDate>Sat, 18 Sep 2010 10:54:59 GMT</pubDate><dc:creator>Oleg Netchaev</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Oh yeah... I almost forgot.  Here's the default collation on my desktop box...SQL_Latin1_General_CP1_CI_AS... and on my server box...SQL_Latin1_General_CP1_CI_ASWhat we need to find out now is why XML seems to run so nice and fast on your box.  Please let me know what the configuration is so I can try to find one similar to test on.  Thanks.</description><pubDate>Sat, 18 Sep 2010 10:29:08 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Ok, Oleg... not sure what the differences are between your machine and the two that I'm running but the Tally table beats the two XML methods pretty badly on both. First, here's my Tally table setup just so we're talking apples and apples.  Also, make sure you're using my test generator so you're testing for "real life" very random data with commas in different positions in each row.[code="sql"]--===== Do this in a nice safe place that everyone has    USE TempDB;--===================================================================--      Create a Tally table from 1 to 11000--===================================================================--===== Create and populate the Tally table on the fly.     -- This ISNULL function makes the column NOT NULL     -- so we can put a Primary Key on it SELECT TOP 11000        IDENTITY(INT,1,1) AS N   INTO dbo.Tally   FROM Master.sys.ALL_Columns ac1  CROSS JOIN Master.sys.ALL_Columns ac2;--===== Add a CLUSTERED Primary Key to maximize performance  ALTER TABLE dbo.Tally    ADD CONSTRAINT PK_Tally_N         PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100;--===== Allow the general public to use it  GRANT SELECT ON dbo.Tally TO PUBLIC;GO[/code]Here are the run results from my desktop machine (8 year old single P4 1.8GHz 1GB Ram).  Do notice the 100 element test, please.  And, hell no... with speeds like that, there's no way I'm running a 1,000 element test. [img]http://www.sqlservercentral.com/Forums/Attachment7055.aspx[/img]Here are the exact same runs from a real server running 4 32-bit Xeons at 3GHz on Windows 2003 and SQL Server 2005 sp3 with 2GB memory allocated to SQL Server.  Again... notice the 100 element runs.[img]http://www.sqlservercentral.com/Forums/Attachment7056.aspx[/img]I just might try the 1000 element runs on the server (well... probably NOT the XML3 version).</description><pubDate>Sat, 18 Sep 2010 10:24:12 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>My tests DO reflect the total picture... just not on your machine. ;-)  They show the test results on my machine.  It's obvious that my machine is different than yours (mine cannot do parallelism... it only has 1 CPU, for example) so  I'm setting up to do the tests on a more modern machine than my 8 year old desktop.  We'll see soon.Please post the CREATE TABLE statement for your Tally Table.  Be sure to include any indexes it may have.  It would also be helpful to know what your default COLLATION is, please.  We need to make sure we're comparing apples to apples.  Thanks.And to be sure, the only thing I care about right now is splitting things that are 8k or less.  We already know that JOINing to a MAX column is terrible for performance and that a While Loop is one of the ways to go there.</description><pubDate>Sat, 18 Sep 2010 08:05:16 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Jeff,I am terribly sorry, but there is no way that your test reflects the picture. I ran Split8K and xml only tests and results I get are totally different. First, I would like to clarify that you did mean 25 seconds, not 25 minutes to process 10,000 records with 100 items in one record, right? The biggest problem with Split8KTally was pointed out a while ago by Lynn Pettis who correctly said that it becomes pretty slow when the datalength of the items in the record becomes close to 8K. This assessment is absolutely correct, yet you tested a rather small number of items to split in each record (4, 16 etc). When I ran your tests I saw that the smaller number of items to split, the better Split8KTally becomes. So, for 16 items per record I saw Split8kTally takes 1 second while xml takes 4. However, when I increased the records to hold 100 items each, I saw that Split8KTally takes 15 seconds while xml takes 20, still faster but not as much. When I increased the number to 1000 per record, which basically pushed the Split8K to about as much as it can handle then I saw thatSplit8KTally took 4 minutes 39 seconds xml still took 20 seconds.Then I could no longer test Split8KTally (does not handle more than 8K worth of csv data per record), but continued with xml. I increased the number of items to split in each record to 10,000, which made the longest record to have 117,840 characters, and the number of records as a result of cross apply - 100 million (10,000 records with 10,000 csv values in each). This took 45 seconds. This clearly shows that xml handily beats other methods (outside of CLR procedure, which can be created to complete the whole thing in a heartbeat) once the data to split becomes big enough in number of csv values to split.Honestly, I would not even go for all this re-testing, but after I saw the 25 minutes instead of expected 25 seconds coupled with the fact that the WHILE loop (the which should not even exist in T-SQL) can beat my xml solution, I had to do it.Please let me know if my assumptions are incorrect is some way. I used your harness without making any changes.Thanks,Oleg</description><pubDate>Sat, 18 Sep 2010 00:06:05 GMT</pubDate><dc:creator>Oleg Netchaev</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Long story short... I wouldn't use XML for splitting if I were you.  :-)</description><pubDate>Fri, 17 Sep 2010 19:59:49 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Ok... here we go... First, here are 4 functions.  I believe Oleg will recognize the one called "XML-1 (Split8KXML1 mlTVF)"....[code="sql"]--=====================================================================================================================--      Create the various functions to test--=====================================================================================================================--===== Do this in a nice safe place that everyone has    USE TempDB;--===== Tally Table (Split8KTally iTVF) ===============================================================================   DROP FUNCTION dbo.Split8KTally GO CREATE FUNCTION dbo.Split8KTally         (@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))RETURNS TABLE AS RETURN SELECT CAST(ROW_NUMBER() OVER (ORDER BY N) AS INT) AS ItemNumber,        SUBSTRING(@Parameter,N,CHARINDEX(@Delimiter,@Parameter+@Delimiter,N)-N) AS ItemValue   FROM dbo.Tally  WHERE N BETWEEN 1 AND LEN(@Parameter)    AND SUBSTRING(@Delimiter+@Parameter,N,1) = @Delimiter; --Notice how we find the commaGO--===== XML-1 (Split8KXML1 mlTVF) =====================================================================================   DROP FUNCTION dbo.Split8KXML1 GO CREATE FUNCTION dbo.Split8KXML1        (@Parameter VARCHAR(MAX), @Delimiter VARCHAR(1))RETURNS @Result TABLE         (ItemNumber INT, ItemValue VARCHAR(8000))AS  BEGIN        DECLARE @XML  XML;         SELECT @XML = '&amp;lt;r&amp;gt;'+REPLACE(@Parameter, ',', '&amp;lt;/r&amp;gt;&amp;lt;r&amp;gt;')+'&amp;lt;/r&amp;gt;';         INSERT INTO @Result (ItemNumber, ItemValue)         SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ItemNumber,                Item.value('text()[1]', 'varchar(max)') AS ItemValue           FROM @XML.nodes('//r') R(Item); RETURN;   END;GO--===== XML-3 (Split8KXML3 iTVF) ======================================================================================   DROP FUNCTION dbo.Split8KXML3 GO CREATE FUNCTION dbo.Split8KXML3        (@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))RETURNS TABLE WITH SCHEMABINDING AS RETURN SELECT CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS INT) AS ItemNumber,        R.Item.value('text()[1]', 'varchar(max)') AS ItemValue   FROM (SELECT CAST('&amp;lt;r&amp;gt;'+REPLACE(@Parameter, @Delimiter, '&amp;lt;/r&amp;gt;&amp;lt;r&amp;gt;')+'&amp;lt;/r&amp;gt;' AS XML)) X(N)  CROSS APPLY N.nodes('//r') R(Item);GO--===== Loop Method 1 (Split8KL1 mlTVF) ===============================================================================   DROP FUNCTION dbo.Split8KL1 GO CREATE FUNCTION dbo.Split8KL1        (@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))RETURNS @Result TABLE (ItemNumber INT IDENTITY(1,1), ItemValue VARCHAR(8000))     AS  BEGIN--===== Declare a variable to remember the position of the current commaDECLARE @N INT;--===== Add start and end commas to the Parameter so we can handle     -- single elements SELECT @Parameter = @Delimiter + @Parameter + @Delimiter,--===== Preassign the current comma as the first character        @N = 1;--===== Loop through and find each comma, then insert the string value      -- found between the current comma and the next comma.  @N is     -- the position of the current comma.  WHILE @N &amp;lt; LEN(@Parameter) --Don't include the last comma  BEGIN        --==== Do the insert using the value between the commas        INSERT INTO @Result (ItemValue)        SELECT SUBSTRING(@Parameter, @N+1, CHARINDEX(@Delimiter, @Parameter, @N+1)-@N-1);        --==== Find the next comma        SELECT @N = CHARINDEX(@Delimiter, @Parameter, @N+1);    END; --END While RETURN;    END; --END FunctionGO[/code]Next, some test data.  Read the comments for where to make changes to get the data to vary, please...[code="sql"]--=====================================================================================================================--      Build the test data --=====================================================================================================================--===== Do this in a nice safe place that everyone has    USE TempDB;--===== Conditionally drop the test table to make reruns easier     IF OBJECT_ID('TempDB..CsvTest','U') IS NOT NULL        DROP TABLE CsvTest;--===== This creates and populates a test table on the fly containing a     -- sequential column and a randomly generated CSV Parameter column. SELECT TOP (10000) --Controls the number of rows in the test table        ISNULL(ROW_NUMBER() OVER (ORDER BY(SELECT NULL)),0) AS RowNum,        (         SELECT CAST(STUFF( --=== STUFF get's rid of the leading comma                ( --=== This builds CSV row with a leading comma                 SELECT TOP (16) --Controls the number of CSV elements in each row                        ','+CAST(ABS(CHECKSUM(NEWID()))%100000 AS VARCHAR(10))                   FROM dbo.Tally t3      --Classic cross join pseudo-cursor                  CROSS JOIN dbo.Tally t4 --can produce row sets up 121 million.                  WHERE t1.N &amp;lt;&amp;gt; t3.N --Without this line, all rows would be the same                    FOR XML PATH('')                 )                ,1,1,'') AS NVARCHAR(MAX))        ) AS CsvParameter   INTO CsvTest   FROM dbo.Tally t1        --Classic cross join pseudo-cursor  CROSS JOIN dbo.Tally t2;  --can produce row sets up 121 million.--===== Let's add a PK just for grins.  Since it's a temp table, we won't name it.  ALTER TABLE CsvTest    ADD PRIMARY KEY CLUSTERED (RowNum) WITH FILLFACTOR = 100;GO[/code]Last but not least, let's test the functions.  I ran the following for 4 different sets of test data with SQL Profiler running.  You can tell what's what just by reading the following SQL Profiler output.[code="sql"]--=====================================================================================================================--      Run the functions (Profiler turned on for this given SPID)--=====================================================================================================================--===== Do this in a nice safe place that everyone has    USE TempDB;GO--===== Tally Table (Split8KTally iTVF) ===============================================================================DECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT; SELECT @RowNum = csv.RowNum, @ItemNumber = split.ItemNumber, @ItemValue = CAST(split.ItemValue AS INT)   FROM CsvTest csv  CROSS APPLY dbo.Split8KTally(csv.CsvParameter,',') AS split;GO--===== XML-1 (Split8KXML1 mlTVF) =====================================================================================DECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT; SELECT @RowNum = csv.RowNum, @ItemNumber = split.ItemNumber, @ItemValue = CAST(split.ItemValue AS INT)   FROM CsvTest csv  CROSS APPLY dbo.Split8KXML1(csv.CsvParameter,',') AS split;GO--===== XML-3 (Split8KXML3 iTVF) ======================================================================================DECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT; SELECT @RowNum = csv.RowNum, @ItemNumber = split.ItemNumber, @ItemValue = CAST(split.ItemValue AS INT)   FROM CsvTest csv  CROSS APPLY dbo.Split8KXML3(csv.CsvParameter,',') AS split;GO--===== Loop Method 1 (Split8KL1 mlTVF) ===============================================================================DECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT; SELECT @RowNum = csv.RowNum, @ItemNumber = split.ItemNumber, @ItemValue = CAST(split.ItemValue AS INT)   FROM CsvTest csv  CROSS APPLY dbo.Split8KL1(csv.CsvParameter,',') AS split;GO[/code]Here's the profiler run.  Even the WHILE Loop beat both versions of the XML.  You don't really want to see the 25 minutes it takes the XML to do 100 elements for 10K Rows.[img]http://www.sqlservercentral.com/Forums/Attachment7049.aspx[/img]The reason why I initially said the XML was looking good is because I'd made a mistake in the creation of my Tally table... I used BIGINT instead of INT and the implicit conversions made a mess of the Tally table solution.  Obviously, I fixed that for this experiment.My machine is an 8 year old single P4 1.8Ghz with 1GB Ram running Windows XP Pro SP3 and SQL Server Developer's Edition 2005 SP3 with no CU's.</description><pubDate>Fri, 17 Sep 2010 19:57:33 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]Quatrei.X (8/27/2010)[/b][hrCool! thanks, I needed this... uhmmm, I have a question about this but in a different forum/section (don't wanna double post)...[url]http://www.sqlservercentral.com/Forums/Topic975437-392-1.aspx?Update=1[/url]hope you can look at it. Thanks in advance! :-D[/quote]So how'd I do over there? :-) Did I satisfy your questions?</description><pubDate>Fri, 27 Aug 2010 18:53:46 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]Jeff Moden (6/29/2010)[/b][hr][quote][b]WayneS (6/28/2010)[/b][hr][quote][b]lmu92 (6/28/2010)[/b][hr]This thread include a post by WayneS showing the [b][url=http://www.sqlservercentral.com/Forums/FindPost904602.aspx]DelimitedSplit8K[/url][/b] function I'm using at the moment.[/quote]I believe I saw a follow-up post from Jeff where he found that the UNPIVOT (as demonstrated in the above link) turns out to be slower when he ran it on his work prod server several times. Jeff, please let us know if I remember this correctly.[/quote]You did, indeed.  I didn't take the time to analyze "why" but on certain machines with multiple processors, the UNPIVOT method sometimes runs substantially slower.  I also didn't understand that the function you good folks were talking about was the function that I posted.  Here's the latest and greatest with all the documentation and optimizations that I currently use for production code... the documentation in the header is quite substantial. {EDIT} Updated the code below to include the lastest performance  (From yesterday) thanks to Paul White.[code="sql"] CREATE FUNCTION dbo.DelimitedSplit8K/*************************************************************************************************** Purpose: Split a given string at a given delimiter and return a list of the split elements (items). Returns: iTVF containing the following: ItemNumber = Element position of Item as a BIGINT (not converted to INT to eliminate a CAST) Item       = Element value as a VARCHAR(8000) CROSS APPLY Usage Example:-----------------------------------------------------------------------------------------------------===== Conditionally drop the test tables to make reruns easier for testing.     -- (this is NOT a part of the solution)     IF OBJECT_ID('tempdb..#JBMTest') IS NOT NULL         DROP TABLE #JBMTest;--===== Create and populate a test table on the fly (this is NOT a part of the solution). SELECT *   INTO #JBMTest   FROM (         SELECT 1,'1,10,100,1000,10000,100000,1000000' UNION ALL         SELECT 2,'2000000,200000,20000,2000,200,20,2' UNION ALL         SELECT 3, 'This,is,a,test'                    UNION ALL         SELECT 4, 'and so is this'                    UNION ALL         SELECT 5, 'This, too (no pun intended)'        ) d (SomeID,SomeValue);GO--===== Split the CSV column for the whole table using CROSS APPLY (this is the solution) SELECT test.SomeID, split.ItemNumber, split.Item   FROM #JBMTest test  CROSS APPLY        (         SELECT ItemNumber, Item           FROM dbo.DelimitedSplit8k(test.SomeValue,',')        ) split;--------------------------------------------------------------------------------------------------- Notes: 1. Optimized for VARCHAR(7999) or less.  No testing or error reporting for truncation at 7999    characters is done. 2. Optimized for single character delimiter.  Multi-character delimiters should be resolved    externally from this function. 3. Optimized for use with CROSS APPLY. 4. Does not "trim" elements just in case leading or trailing blanks are intended. 5. If you don't know how a Tally table can be used to replace loops, please see the following...    http://www.sqlservercentral.com/articles/T-SQL/62867/ 6. Changing this function to use VARCHAR(MAX) will cause it to run twice as slow.  It's just the     nature of VARCHAR(MAX) whether it fits in-row or not. 7. Multi-machine testing for the method of using UNPIVOT instead of 10 SELECT/UNION ALLs shows    that the UNPIVOT method is quite machine dependent and can slow things down quite a bit. 8. Performance testing shows using "TOP" for the limiting criteria of "N" is actually     slower and slightly more CPU intensive than the traditional WHERE N &amp;lt; LEN(@pString) + 2.  9. Performance testing shows using ORDER BY (SELECT x) where "x" is anything is actually    slower and slightly more CPU intensive than the traditional ORDER BY (SELECT N). Credits: This code is the product of many people's efforts including but not limited to the following: cteTally concept originally by Iztek Ben Gan and "decimalized" by Lynn Pettis (and others) for a bit of extra speed and finally redacted by Jeff Moden for a different slant on readability and  compactness. Hat's off to Paul White for his simple explanations of CROSS APPLY. Finally, special thanks to Erland Sommarskog for his tireless efforts to help people understand what you can actually do with T-SQL. I also thank whoever wrote the first article I ever saw on "numbers tables" which is located at the following URL ... http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-numbers-table.html Revision History: Rev 00 - 20 Jan 2010 - Concept: Lynn Pettis and others.                        Redaction/Implementation: Jeff Moden         - Base 10 redaction and reduction for CTE.  (Total rewrite) Rev 01 - 13 Mar 2010 - Jeff Moden        - Removed one additional concatenation and one subtraction from the SUBSTRING in the           SELECT List for that tiny bit of extra speed. Rev 02 - 14 Apr 2010 - Jeff Moden        - No code changes.  Added CROSS APPLY usage example to the header, some additional credits,          and extra documentation. Rev 03 - 18 Apr 2010 - Jeff Moden        - No code changes.  Added notes 7, 8, and 9 about certain "optimizations" that dont'           actually work for this type of function. Rev 04 - 29 Jun 2010 - Jeff Moden        - Added WITH SCHEMABINDING thanks to a note by Paul White.  This prevents an unnecessary          "Table Spool" when the function is used in an UPDATE statement even though the function          makes no external references.***************************************************************************************************/--===== Define I/O parameters        (        @pString    VARCHAR(7999),        @pDelimiter CHAR(1)        )RETURNS TABLE   WITH SCHEMABINDINGASRETURN--===== "Inline" CTE Driven "Tally Table” produces values up to     -- 10,000... enough to cover VARCHAR(8000)WITH      E1(N) AS ( --=== Create Ten 1's                 SELECT 1 UNION ALL SELECT 1 UNION ALL                 SELECT 1 UNION ALL SELECT 1 UNION ALL                 SELECT 1 UNION ALL SELECT 1 UNION ALL                 SELECT 1 UNION ALL SELECT 1 UNION ALL                 SELECT 1 UNION ALL SELECT 1 --10               ),      E2(N) AS (SELECT 1 FROM E1 a, E1 b),   --100      E4(N) AS (SELECT 1 FROM E2 a, E2 b),   --10,000cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4)  --===== Do the split SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,        SUBSTRING(@pString, N, CHARINDEX(@pDelimiter, @pString + @pDelimiter, N) - N) AS Item   FROM cteTally  WHERE N &amp;lt; LEN(@pString) + 2    AND SUBSTRING(@pDelimiter + @pString, N, 1) = @pDelimiter;GO[/code][/quote]Cool! thanks, I needed this... uhmmm, I have a question about this but in a different forum/section (don't wanna double post)...[url]http://www.sqlservercentral.com/Forums/Topic975437-392-1.aspx?Update=1[/url]hope you can look at it. Thanks in advance! :-D</description><pubDate>Fri, 27 Aug 2010 01:33:17 GMT</pubDate><dc:creator>Quatrei.X</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>The initial testing I just got done doing looks pretty darned good on a parameter of 100,000 elements, Oleg.  I'm still playing with it.</description><pubDate>Thu, 26 Aug 2010 17:27:41 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]Jeff Moden (7/6/2010)[/b][hr]Thank you for the test harness and your good code, Oleg.  I'll check it out.[/quote]Sorry, Oleg... I've not nbeen able to put in the time to do a deep dive on the code and this certainly deserves int.  I'm not going to be able to get to it for at least another 10 days because I'm going on vacation where I won't have access to a computer with SQL Server on it.--Jeff Moden</description><pubDate>Fri, 09 Jul 2010 19:29:54 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Thank you for the test harness and your good code, Oleg.  I'll check it out.</description><pubDate>Tue, 06 Jul 2010 21:24:40 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Jeff,I would like to ask your opinion on the version of the split I recently wrote. It is implemented as a stored proc and does not have any limitation on the size of the string to split. Performance tests that I ran show that feeding text consisting of 10,000 sentences (dot delimited) to it (about 500,000 characters in total or 1 MB in datalength) takes about 0.3 seconds to execute, and expanding the size 10-fold to 100,000 sentences (5 mln characters in total or 10 MB in datalength) increases the execution time also about 10-fold to a total of 3 seconds. This is on a 2-year old desktop with Windows XP SP3, 2 GB of RAM duo-core CPU. The idea is to utilize the engine's ability to process xml:[code="sql"]create proc dbo.usp_DelimitedSplit(    @text nvarchar(max),     @delimiter char(1),    @entitize bit = 1)asbegin    declare @xml xml;    if @entitize = 1 set @text = (select @text for xml path(''));    set @xml = '&amp;lt;r&amp;gt;' + replace(@text, @delimiter, '&amp;lt;/r&amp;gt;&amp;lt;r&amp;gt;') + '&amp;lt;/r&amp;gt;';    select        row_number() over (order by (select null)) item_number,        item.value('text()[1]', 'varchar(max)') item_value        from @xml.nodes('//r') R(item);end;go[/code]As you can see, the procedure is very simple. The optional @entitize parameter deserves an explanation though. It is there to ensure that if the text contains some "xml unfriendly" characters then these are entitized by default. However, it leaves the option to the calling code to pass 0 as @entitize value should the caller be aware that the text has already undergone entitizing treatment prior to the procedure execution. For example,[code="sql"]declare @delimiter char(1);declare @text nvarchar(max);select @delimiter = '.', @text = 'a&amp;lt;x&amp;gt;&amp;z.b.c';exec dbo.usp_DelimitedSplit @text, @delimiter;[/code]automatically handles unfriendly characters returning [code="sql"]item_number          item_value-------------------------------1                    a&amp;lt;x&amp;gt;&amp;z2                    b3                    c[/code]Here is the test I used to start from scratch, pad the stirng with 10,000 sentences and then call the procedure to split it. The number 10000 is hard-coded, but can be changed to whatever arbitrary number:[code="sql"]declare @delimiter char(1);declare @text nvarchar(max);set @delimiter = '.';-- populate @text like this-- "This is a&amp;lt;&amp;&amp;gt;/&amp;gt;&amp;lt;x&amp;gt;&amp; part of the text to split into rows_000001."-- "This is a&amp;lt;&amp;&amp;gt;/&amp;gt;&amp;lt;x&amp;gt;&amp; part of the text to split into rows_000002." etcset    @text = stuff(cast(    (        select top 10000             @delimiter + 'This is a&amp;lt;&amp;&amp;gt;/&amp;gt;&amp;lt;x&amp;gt;&amp; part of the text to split into rows_' +             replace(str(row_number() over(order by a.[object_id]), 6), ' ', '0')            from sys.objects a cross join sys.objects b            for xml path('')    ) as varchar(max)), 1, 1, '');-- because the @text happens to be already entitized,-- pass 0 as @entitize parameter valueexec dbo.usp_DelimitedSplit @text, @delimiter, 0;[/code]The above returns:[code="sql"]item_number          item_value----------------------------------------------------------------------------------1                    This is a&amp;lt;&amp;&amp;gt;/&amp;gt;&amp;lt;x&amp;gt;&amp; part of the text to split into rows_0000012                    This is a&amp;lt;&amp;&amp;gt;/&amp;gt;&amp;lt;x&amp;gt;&amp; part of the text to split into rows_000002/* ... abridged ...*/10000                This is a&amp;lt;&amp;&amp;gt;/&amp;gt;&amp;lt;x&amp;gt;&amp; part of the text to split into rows_010000[/code]Thanks,Oleg</description><pubDate>Tue, 06 Jul 2010 17:54:39 GMT</pubDate><dc:creator>Oleg Netchaev</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]scott.pletcher (6/29/2010)[/b][hr]So, the inline CTE has better performance than a tally table?  Interesting.  Sometimes SQL's performance traits are a bit quirky.[/quote]In other functions, I've not found that to be true.  The Tally Table normally squeaks past the CTE for duration but the CTE gens virtually no reads.  I guess it depends on how you want to impact the system.  When I get around to the Tally Table article rewrite, I'll include million row testing for both methods.</description><pubDate>Tue, 29 Jun 2010 18:03:13 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]WayneS (6/29/2010)[/b][hr]Sigh... I sure wish we could put into the briefcase an individual post on the forum... This one of Jeff's certainly would be in it.@Jeff - 1. Did you test after adding the WITH SCHEMABINDING? If not, I see a test in my near future. ;-)2. I see you changed the input parameter from varchar(8000) to varchar(7999), but I don't see a reason for that in your notes. Would you elaborate on this please?Edit:3. Is it safe to assume that a properly configured (Clustered index, 100% FillFactor) real tally table would perform better than the inline one (the comment Scott made below brings this up)?[/quote]1.  I didn't test this one specifically for the WITH SCHEMABINDING because every other test I ran with schemabinding showed either no change or a minor improvement and I'm also being a bit lazy... I wanted to post the function and didn't have the time to do the normal million row testing on this specific function.  I did so without much fear because the tests on other functions show that WITH SCHEMABINDING is one of those things you can pretty much take to the bank.  You know me... I'll get to retesting it soon, though.2.  I changed the input parameter to 7999 as a clue to folks that you have to save space to include at least 1 extra delimiter in the code (@pDelimiter + @pString and @pString + @pDelimiter).  I guess I should spell out the reason in note 1.3.  Again, I've not specifically tested for this specific function.  In all other functions that I've test, both the Tally Table and the Tally CTE perform equally well although the Tally CTE does it virtually without reads.  Again, I'll be doing more testing especially since I made the WITH SCHEMABINDING change.As far as a brief case goes... once I get done retesting the function, I aim to include it in an update on the Tally Table article.</description><pubDate>Tue, 29 Jun 2010 18:00:09 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]WayneS (6/29/2010)[/b][hr]2. I see you changed the input parameter from varchar(8000) to varchar(7999), but I don't see a reason for that in your notes. Would you elaborate on this please?[/quote]I think I can answer that one ...He does a @pString + @pDelimiter to search for the @pDelimiter (to guarantee it will be found) and that total searched expression is limited to 8000 characters.I read the other thread where Paul was discussing his findings.  I was talking to someone at work about it and would like to find it again if anyone happens to know the link.*edit:  Nevermind about the link, found it [url=http://www.sqlservercentral.com/Forums/Topic910545-338-1.aspx]here[/url].</description><pubDate>Tue, 29 Jun 2010 17:56:21 GMT</pubDate><dc:creator>bteraberry</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>So, the inline CTE has better performance than a tally table?  Interesting.  Sometimes SQL's performance traits are a bit quirky.</description><pubDate>Tue, 29 Jun 2010 15:38:07 GMT</pubDate><dc:creator>scott.pletcher</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>Sigh... I sure wish we could put into the briefcase an individual post on the forum... This one of Jeff's certainly would be in it.@Jeff - 1. Did you test after adding the WITH SCHEMABINDING? If not, I see a test in my near future. ;-)2. I see you changed the input parameter from varchar(8000) to varchar(7999), but I don't see a reason for that in your notes. Would you elaborate on this please?Edit:3. Is it safe to assume that a properly configured (Clustered index, 100% FillFactor) real tally table would perform better than the inline one (the comment Scott made below brings this up)?</description><pubDate>Tue, 29 Jun 2010 07:38:13 GMT</pubDate><dc:creator>WayneS</dc:creator></item><item><title>RE: Convert CSV values in three columns to rows</title><link>http://www.sqlservercentral.com/Forums/Topic943562-338-1.aspx</link><description>[quote][b]WayneS (6/28/2010)[/b][hr][quote][b]lmu92 (6/28/2010)[/b][hr]This thread include a post by WayneS showing the [b][url=http://www.sqlservercentral.com/Forums/FindPost904602.aspx]DelimitedSplit8K[/url][/b] function I'm using at the moment.[/quote]I believe I saw a follow-up post from Jeff where he found that the UNPIVOT (as demonstrated in the above link) turns out to be slower when he ran it on his work prod server several times. Jeff, please let us know if I remember this correctly.[/quote]You did, indeed.  I didn't take the time to analyze "why" but on certain machines with multiple processors, the UNPIVOT method sometimes runs substantially slower.  I also didn't understand that the function you good folks were talking about was the function that I posted.  Here's the latest and greatest with all the documentation and optimizations that I currently use for production code... the documentation in the header is quite substantial. {EDIT} Updated the code below to include the lastest performance  (From yesterday) thanks to Paul White.[code="sql"] CREATE FUNCTION dbo.DelimitedSplit8K/*************************************************************************************************** Purpose: Split a given string at a given delimiter and return a list of the split elements (items). Returns: iTVF containing the following: ItemNumber = Element position of Item as a BIGINT (not converted to INT to eliminate a CAST) Item       = Element value as a VARCHAR(8000) CROSS APPLY Usage Example:-----------------------------------------------------------------------------------------------------===== Conditionally drop the test tables to make reruns easier for testing.     -- (this is NOT a part of the solution)     IF OBJECT_ID('tempdb..#JBMTest') IS NOT NULL         DROP TABLE #JBMTest;--===== Create and populate a test table on the fly (this is NOT a part of the solution). SELECT *   INTO #JBMTest   FROM (         SELECT 1,'1,10,100,1000,10000,100000,1000000' UNION ALL         SELECT 2,'2000000,200000,20000,2000,200,20,2' UNION ALL         SELECT 3, 'This,is,a,test'                    UNION ALL         SELECT 4, 'and so is this'                    UNION ALL         SELECT 5, 'This, too (no pun intended)'        ) d (SomeID,SomeValue);GO--===== Split the CSV column for the whole table using CROSS APPLY (this is the solution) SELECT test.SomeID, split.ItemNumber, split.Item   FROM #JBMTest test  CROSS APPLY        (         SELECT ItemNumber, Item           FROM dbo.DelimitedSplit8k(test.SomeValue,',')        ) split;--------------------------------------------------------------------------------------------------- Notes: 1. Optimized for VARCHAR(7999) or less.  No testing or error reporting for truncation at 7999    characters is done. 2. Optimized for single character delimiter.  Multi-character delimiters should be resolved    externally from this function. 3. Optimized for use with CROSS APPLY. 4. Does not "trim" elements just in case leading or trailing blanks are intended. 5. If you don't know how a Tally table can be used to replace loops, please see the following...    http://www.sqlservercentral.com/articles/T-SQL/62867/ 6. Changing this function to use VARCHAR(MAX) will cause it to run twice as slow.  It's just the     nature of VARCHAR(MAX) whether it fits in-row or not. 7. Multi-machine testing for the method of using UNPIVOT instead of 10 SELECT/UNION ALLs shows    that the UNPIVOT method is quite machine dependent and can slow things down quite a bit. 8. Performance testing shows using "TOP" for the limiting criteria of "N" is actually     slower and slightly more CPU intensive than the traditional WHERE N &amp;lt; LEN(@pString) + 2.  9. Performance testing shows using ORDER BY (SELECT x) where "x" is anything is actually    slower and slightly more CPU intensive than the traditional ORDER BY (SELECT N). Credits: This code is the product of many people's efforts including but not limited to the following: cteTally concept originally by Iztek Ben Gan and "decimalized" by Lynn Pettis (and others) for a bit of extra speed and finally redacted by Jeff Moden for a different slant on readability and  compactness. Hat's off to Paul White for his simple explanations of CROSS APPLY. Finally, special thanks to Erland Sommarskog for his tireless efforts to help people understand what you can actually do with T-SQL. I also thank whoever wrote the first article I ever saw on "numbers tables" which is located at the following URL ... http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-numbers-table.html Revision History: Rev 00 - 20 Jan 2010 - Concept: Lynn Pettis and others.                        Redaction/Implementation: Jeff Moden         - Base 10 redaction and reduction for CTE.  (Total rewrite) Rev 01 - 13 Mar 2010 - Jeff Moden        - Removed one additional concatenation and one subtraction from the SUBSTRING in the           SELECT List for that tiny bit of extra speed. Rev 02 - 14 Apr 2010 - Jeff Moden        - No code changes.  Added CROSS APPLY usage example to the header, some additional credits,          and extra documentation. Rev 03 - 18 Apr 2010 - Jeff Moden        - No code changes.  Added notes 7, 8, and 9 about certain "optimizations" that dont'           actually work for this type of function. Rev 04 - 29 Jun 2010 - Jeff Moden        - Added WITH SCHEMABINDING thanks to a note by Paul White.  This prevents an unnecessary          "Table Spool" when the function is used in an UPDATE statement even though the function          makes no external references.***************************************************************************************************/--===== Define I/O parameters        (        @pString    VARCHAR(7999),        @pDelimiter CHAR(1)        )RETURNS TABLE   WITH SCHEMABINDINGASRETURN--===== "Inline" CTE Driven "Tally Table” produces values up to     -- 10,000... enough to cover VARCHAR(8000)WITH      E1(N) AS ( --=== Create Ten 1's                 SELECT 1 UNION ALL SELECT 1 UNION ALL                 SELECT 1 UNION ALL SELECT 1 UNION ALL                 SELECT 1 UNION ALL SELECT 1 UNION ALL                 SELECT 1 UNION ALL SELECT 1 UNION ALL                 SELECT 1 UNION ALL SELECT 1 --10               ),      E2(N) AS (SELECT 1 FROM E1 a, E1 b),   --100      E4(N) AS (SELECT 1 FROM E2 a, E2 b),   --10,000cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4)  --===== Do the split SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,        SUBSTRING(@pString, N, CHARINDEX(@pDelimiter, @pString + @pDelimiter, N) - N) AS Item   FROM cteTally  WHERE N &amp;lt; LEN(@pString) + 2    AND SUBSTRING(@pDelimiter + @pString, N, 1) = @pDelimiter;GO[/code]</description><pubDate>Tue, 29 Jun 2010 06:38:06 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item></channel></rss>