Ok... here we go...
First, here are 4 functions. I believe Oleg will recognize the one called "XML-1 (Split8KXML1 mlTVF)"....
--=====================================================================================================================
-- Create the various functions to test
--=====================================================================================================================
--===== Do this in a nice safe place that everyone has
USE TempDB;
--===== Tally Table (Split8KTally iTVF) ===============================================================================
DROP FUNCTION dbo.Split8KTally
GO
CREATE FUNCTION dbo.Split8KTally
(@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))
RETURNS TABLE AS
RETURN
SELECT CAST(ROW_NUMBER() OVER (ORDER BY N) AS INT) AS ItemNumber,
SUBSTRING(@Parameter,N,CHARINDEX(@Delimiter,@Parameter+@Delimiter,N)-N) AS ItemValue
FROM dbo.Tally
WHERE N BETWEEN 1 AND LEN(@Parameter)
AND SUBSTRING(@Delimiter+@Parameter,N,1) = @Delimiter; --Notice how we find the comma
GO
--===== XML-1 (Split8KXML1 mlTVF) =====================================================================================
DROP FUNCTION dbo.Split8KXML1
GO
CREATE FUNCTION dbo.Split8KXML1
(@Parameter VARCHAR(MAX), @Delimiter VARCHAR(1))
RETURNS @Result TABLE
(ItemNumber INT, ItemValue VARCHAR(8000))AS
BEGIN
DECLARE @XML XML;
SELECT @XML = '<r>'+REPLACE(@Parameter, ',', '</r><r>')+'</r>';
INSERT INTO @Result (ItemNumber, ItemValue)
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ItemNumber,
Item.value('text()[1]', 'varchar(max)') AS ItemValue
FROM @XML.nodes('//r') R(Item);
RETURN;
END;
GO
--===== XML-3 (Split8KXML3 iTVF) ======================================================================================
DROP FUNCTION dbo.Split8KXML3
GO
CREATE FUNCTION dbo.Split8KXML3
(@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
SELECT CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS INT) AS ItemNumber,
R.Item.value('text()[1]', 'varchar(max)') AS ItemValue
FROM (SELECT CAST('<r>'+REPLACE(@Parameter, @Delimiter, '</r><r>')+'</r>' AS XML)) X(N)
CROSS APPLY N.nodes('//r') R(Item)
;
GO
--===== Loop Method 1 (Split8KL1 mlTVF) ===============================================================================
DROP FUNCTION dbo.Split8KL1
GO
CREATE FUNCTION dbo.Split8KL1
(@Parameter VARCHAR(8000), @Delimiter VARCHAR(1))
RETURNS @Result TABLE (ItemNumber INT IDENTITY(1,1), ItemValue VARCHAR(8000))
AS
BEGIN
--===== Declare a variable to remember the position of the current comma
DECLARE @N INT;
--===== Add start and end commas to the Parameter so we can handle
-- single elements
SELECT @Parameter = @Delimiter + @Parameter + @Delimiter,
--===== Preassign the current comma as the first character
@N = 1;
--===== Loop through and find each comma, then insert the string value
-- found between the current comma and the next comma. @N is
-- the position of the current comma.
WHILE @N < LEN(@Parameter) --Don't include the last comma
BEGIN
--==== Do the insert using the value between the commas
INSERT INTO @Result (ItemValue)
SELECT SUBSTRING(@Parameter, @N+1, CHARINDEX(@Delimiter, @Parameter, @N+1)-@N-1);
--==== Find the next comma
SELECT @N = CHARINDEX(@Delimiter, @Parameter, @N+1);
END; --END While
RETURN;
END; --END Function
GO
Next, some test data. Read the comments for where to make changes to get the data to vary, please...
--=====================================================================================================================
-- Build the test data
--=====================================================================================================================
--===== Do this in a nice safe place that everyone has
USE TempDB;
--===== Conditionally drop the test table to make reruns easier
IF OBJECT_ID('TempDB..CsvTest','U') IS NOT NULL
DROP TABLE CsvTest;
--===== This creates and populates a test table on the fly containing a
-- sequential column and a randomly generated CSV Parameter column.
SELECT TOP (10000) --Controls the number of rows in the test table
ISNULL(ROW_NUMBER() OVER (ORDER BY(SELECT NULL)),0) AS RowNum,
(
SELECT CAST(STUFF( --=== STUFF get's rid of the leading comma
( --=== This builds CSV row with a leading comma
SELECT TOP (16) --Controls the number of CSV elements in each row
','+CAST(ABS(CHECKSUM(NEWID()))%100000 AS VARCHAR(10))
FROM dbo.Tally t3 --Classic cross join pseudo-cursor
CROSS JOIN dbo.Tally t4 --can produce row sets up 121 million.
WHERE t1.N <> t3.N --Without this line, all rows would be the same
FOR XML PATH('')
)
,1,1,'') AS NVARCHAR(MAX))
) AS CsvParameter
INTO CsvTest
FROM dbo.Tally t1 --Classic cross join pseudo-cursor
CROSS JOIN dbo.Tally t2; --can produce row sets up 121 million.
--===== Let's add a PK just for grins. Since it's a temp table, we won't name it.
ALTER TABLE CsvTest
ADD PRIMARY KEY CLUSTERED (RowNum) WITH FILLFACTOR = 100;
GO
Last but not least, let's test the functions. I ran the following for 4 different sets of test data with SQL Profiler running. You can tell what's what just by reading the following SQL Profiler output.
--=====================================================================================================================
-- Run the functions (Profiler turned on for this given SPID)
--=====================================================================================================================
--===== Do this in a nice safe place that everyone has
USE TempDB;
GO
--===== Tally Table (Split8KTally iTVF) ===============================================================================
DECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT;
SELECT @RowNum = csv.RowNum, @ItemNumber = split.ItemNumber, @ItemValue = CAST(split.ItemValue AS INT)
FROM CsvTest csv
CROSS APPLY dbo.Split8KTally(csv.CsvParameter,',') AS split;
GO
--===== XML-1 (Split8KXML1 mlTVF) =====================================================================================
DECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT;
SELECT @RowNum = csv.RowNum, @ItemNumber = split.ItemNumber, @ItemValue = CAST(split.ItemValue AS INT)
FROM CsvTest csv
CROSS APPLY dbo.Split8KXML1(csv.CsvParameter,',') AS split;
GO
--===== XML-3 (Split8KXML3 iTVF) ======================================================================================
DECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT;
SELECT @RowNum = csv.RowNum, @ItemNumber = split.ItemNumber, @ItemValue = CAST(split.ItemValue AS INT)
FROM CsvTest csv
CROSS APPLY dbo.Split8KXML3(csv.CsvParameter,',') AS split;
GO
--===== Loop Method 1 (Split8KL1 mlTVF) ===============================================================================
DECLARE @RowNum INT, @ItemNumber INT, @ItemValue INT;
SELECT @RowNum = csv.RowNum, @ItemNumber = split.ItemNumber, @ItemValue = CAST(split.ItemValue AS INT)
FROM CsvTest csv
CROSS APPLY dbo.Split8KL1(csv.CsvParameter,',') AS split;
GO
Here's the profiler run. Even the WHILE Loop beat both versions of the XML. You don't really want to see the 25 minutes it takes the XML to do 100 elements for 10K Rows.
The reason why I initially said the XML was looking good is because I'd made a mistake in the creation of my Tally table... I used BIGINT instead of INT and the implicit conversions made a mess of the Tally table solution. Obviously, I fixed that for this experiment.
My machine is an 8 year old single P4 1.8Ghz with 1GB Ram running Windows XP Pro SP3 and SQL Server Developer's Edition 2005 SP3 with no CU's.
--Jeff Moden
Change is inevitable... Change for the better is not.