John, you happen to have come to the right place at the right time...
There are several methods to do this, but for shear performance, nothing beats the method I'm getting ready to show you. How do I know? I'm about 20 hours into some experiments with all the methods for an article I'm writing. You'll also find a lot of people that have also done similar experiments and the following method is absolutely the fastest and certainly one of the easiest.
The first thing you need is a "Tally" or "Numbers" table... I prefer the word "Tally" just because it sounds cooler and describes a bit about what the table is actually used for. It consists of a single very well index column of sequential numbers that start at 1 and go to some number. Depending on what I anticipate the customer needs to be, I make one that has 11k, 14k, 19k, or even a million rows long. The table doesn't take much room compared to the utility it provides... it has dozens of uses and doing "whole table splits" like you need to is just one of the things that the Tally table does very well.
Just because... I'm going to assume that you're only going to use a max of 8000 characters in your CSV column. I have a method that will handle VARCHAR(MAX), but it's probably overkill, here. Using VARCHAR(MAX) also slows things down... sometimes, a lot!
So, without further digression, here's how to make an 11 k row Tally table...
--===== Create and populate the Tally table on the fly
SELECT TOP 11000 --equates to more than 30 years of dates
IDENTITY(INT,1,1) AS N
FROM Master.dbo.SysColumns sc1,
--===== Add a Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
--===== Allow the general public to use it
GRANT SELECT ON dbo.Tally TO PUBLIC
Now... let's use your example data that you were kind enough to supply to do a "whole table split" using the Tally table... The key here is that we have to have a primary key on the table, so I've changed the values in your ID column to be UNIQUE.
-- Create and populate a test table. THIS IS NOT PART OF THE SOLUTION!
CREATE TABLE Source
ID INT PRIMARY KEY CLUSTERED,
INSERT INTO Source (ID,Data) VALUES (1, 'APPLE, DOG, CAT')
INSERT INTO Source (ID,Data) VALUES (2, 'A,B,C')
-- Split the whole table's Data column all at once
-- This IS the solution
--===== Declare a variable to hold the delimiter
DECLARE @Delim CHAR(1)
SET @Delim = ','
--===== Do the split with a count
DataVal = LTRIM(SUBSTRING(@Delim+s.Data, t.N+1, CHARINDEX(@Delim, s.Data+@Delim, t.N+1)-t.N))
FROM dbo.Tally t
RIGHT OUTER JOIN --Necessary in case SomeCsv is NULL
ON SUBSTRING(@Delim+s.Data, t.N, 1) = @Delim
AND t.N < LEN(@Delim+s.Data)
I could explain in detail just how it works... but then there'd be no sense in writting the article.
By the way, this method will split a million rows with an average of 10 items in the CSV, not including the display time, in just 93 seconds.
Please let me know if that works for ya...
is pronounced ree-bar and is a Modenism for R
First step towards the paradigm shift of writing Set Based code: Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair
How to post code problemsHow to post performance problemsForum FAQs