Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Split input string into multicolumn - multirows


Split input string into multicolumn - multirows

Author
Message
murthyvs
murthyvs
Valued Member
Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)

Group: General Forum Members
Points: 62 Visits: 86
Hello all - I am having hard time to split an input string into multicolumn - multirows.

Task - Create a stored procedure that reads an input string with pre-defined field and row terminators; splits the string into multicolumn - multirows; and inserts records into a table.

The table structure is predefined:

create table #temp (field1 varchar(200), field2 varchar(200));

Here is an example input string: (the input string is of variable length; the fields are variable lengths as well)
30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|

";" = field terminator
"|" = row terminator

The stored proc should insert the values in this fashion:
field1 field2
30 38469
31 38470
32 38471
33 38472
34 38473
35 38474
36 38475
37 38476
38 38477

Storing the input string as a text file is not an option. Life would have been much better if I could save the text file and use bulk insert!!

Any help is greatly appreciated. Thanks!!
nigelrivett
nigelrivett
Old Hand
Old Hand (396 reputation)Old Hand (396 reputation)Old Hand (396 reputation)Old Hand (396 reputation)Old Hand (396 reputation)Old Hand (396 reputation)Old Hand (396 reputation)Old Hand (396 reputation)

Group: General Forum Members
Points: 396 Visits: 211
This will do something like that.
It caters for any number of fields - just amend the final select for the number required

declare @s varchar(8000)
select @s = '30;38469;1|31;38470;1|32;38471;1|33;38472;1|34;38473;1|35;38474;1|36;38475;1|37;38476;1|38;38477;1|'
select @s = '30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|'

declare @rowterm varchar(1) = '|'
declare @fieldterm varchar(1) = ';'

;with cte as
(
select rowstrt = 1, rowend = charindex(@rowterm,@s)-1, seq = 1
union all
select rowstrt = rowend+2, rowend = charindex(@rowterm,@s,rowend+2)-1, seq = seq + 1
from cte where charindex(@rowterm,@s,rowend+2)<>0
) ,
cte2 as
(
select s = substring(@s, rowstrt,rowend-rowstrt+1), seq
from cte
) ,
cte3 as
(
select fldstrt = 1, fldend = charindex(@fieldterm,s)-1, seq, s, fldseq = 1 from cte2
union all
select fldstrt = fldend+2,
fldend = case when charindex(@fieldterm,s,fldend+2) <> 0 then charindex(@fieldterm,s,fldend+2)-1 else len(s) end,
seq, s, fldseq = fldseq+1
from cte3 where fldend < len(s)
)
select s1.s, s2.s, s3.s
from
(select seq, s = substring(s,fldstrt,fldend-fldstrt+1) from cte3 where fldseq = 1) s1
join (select seq, s = substring(s,fldstrt,fldend-fldstrt+1) from cte3 where fldseq = 2) s2 on s1.seq = s2.seq
left join (select seq, s = substring(s,fldstrt,fldend-fldstrt+1) from cte3 where fldseq = 3) s3 on s1.seq = s3.seq


Cursors never.
DTS - only when needed and never to control.

Cadavre
Cadavre
SSCrazy
SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)

Group: General Forum Members
Points: 2596 Visits: 8437
My hope is that Jeff Moden won't take um-bridge to my ripping apart of his code. . . :-D

I've based this answer on the 8K splitter, with some modifications to make it split on multiple deliminators.

IF EXISTS (SELECT 1
FROM sysobjects
WHERE id = OBJECT_ID('SSC_Multi_Split') AND OBJECTPROPERTY(id, 'IsProcedure') = 1)
BEGIN
DROP PROCEDURE SSC_Multi_Split;
END
GO
CREATE PROCEDURE SSC_Multi_Split (@string VARCHAR(8000), @rowDeliminater CHAR(1), @colDeliminater CHAR(1))
AS
BEGIN
IF object_id('tempdb..#temporaryResultHolder') IS NOT NULL
BEGIN
DROP TABLE #temporaryResultHolder;
END;

WITH CTE1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
CTE2(N) AS (SELECT 1 FROM CTE1 X CROSS JOIN CTE1 Y),
CTE3(N) AS (SELECT 1 FROM CTE2 X CROSS JOIN CTE2 Y),
CTE4(N) AS (SELECT 1 FROM CTE3 X CROSS JOIN CTE3 Y),
CTE5(N) AS (SELECT 1 FROM CTE4 X CROSS JOIN CTE4 Y),
CTE6(N) AS (SELECT 1 FROM CTE5 X CROSS JOIN CTE5 Y),
TALLY(N) AS (SELECT TOP (ISNULL(DATALENGTH(@string),0))
ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM CTE6),
RowStart(start) AS (SELECT 1 UNION ALL
SELECT N+1
FROM TALLY
WHERE SUBSTRING(@string,N,1) = @rowDeliminater
),
RowLen(start,size) AS (SELECT start,
ISNULL(NULLIF(CHARINDEX(@rowDeliminater,@string,start),0)-start,ISNULL(DATALENGTH(@string),8000))
FROM RowStart
),
RowData(rowNumber,data) AS (SELECT ROW_NUMBER() OVER(ORDER BY start),
SUBSTRING(@string, start, size)
FROM RowLen
),
ColumnStart(start, rowNumber) AS (SELECT b.N, b.rowNumber
FROM (SELECT N+1, rowNumber
FROM TALLY
CROSS APPLY (SELECT rowNumber,data
FROM RowData) b
WHERE SUBSTRING(data,N,1) = @colDeliminater
)a(N, rowNumber)
CROSS APPLY (SELECT 1, rowNumber UNION ALL SELECT N, rowNumber)b(N, rowNumber)
),
ColumnLen(rowNumber,start,size,data) AS (SELECT b.rowNumber, start,
ISNULL(NULLIF(CHARINDEX(@colDeliminater,data,start),0)-start,ISNULL(DATALENGTH(data),8000)),
data
FROM ColumnStart a
CROSS APPLY (SELECT rowNumber,data
FROM RowData
WHERE a.rowNumber = rowNumber) b
),
allData(rowNumber,columnNumber,data) AS (SELECT rowNumber, ROW_NUMBER() OVER(PARTITION BY rowNumber ORDER BY start),
SUBSTRING(data, start, size)
FROM ColumnLen
)
SELECT rowNumber,columnNumber,data
INTO #temporaryResultHolder
FROM allData;

DECLARE @SQL NVARCHAR(MAX);

WITH CTE1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
CTE2(N) AS (SELECT 1 FROM CTE1 X CROSS JOIN CTE1 Y),
CTE3(N) AS (SELECT 1 FROM CTE2 X CROSS JOIN CTE2 Y),
CTE4(N) AS (SELECT 1 FROM CTE3 X CROSS JOIN CTE3 Y),
CTE5(N) AS (SELECT 1 FROM CTE4 X CROSS JOIN CTE4 Y),
CTE6(N) AS (SELECT 1 FROM CTE5 X CROSS JOIN CTE5 Y),
TALLY(N) AS (SELECT TOP (ISNULL(DATALENGTH(@string),0))
ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM CTE6)
SELECT @SQL = STUFF((SELECT ','+CHAR(13)+CHAR(10)+'MAX(CASE WHEN columnNumber = '+CAST(N AS VARCHAR(20))+' THEN data ELSE '+CHAR(39)+CHAR(39)+' END) AS field'+CAST(N AS VARCHAR(20))
FROM TALLY
CROSS APPLY (SELECT TOP 1 columnNumber
FROM #temporaryResultHolder
ORDER BY columnNumber DESC) b
WHERE columnNumber >= N
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)'),1,3,'');

SELECT @SQL = 'SELECT '+@SQL+CHAR(13)+CHAR(10)+'FROM #temporaryResultHolder'+CHAR(13)+CHAR(10)+'GROUP BY rowNumber'+CHAR(13)+CHAR(10)+'ORDER BY rowNumber';

EXECUTE sp_executeSQL @SQL;
END
GO



To test: -
EXECUTE SSC_Multi_Split '30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|','|',';';



Which returns: -
field1  field2
------ -------
30 38469
31 38470
32 38471
33 38472
34 38473
35 38474
36 38475
37 38476
38 38477



Forever trying to learn

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better, quicker answers on SQL Server performance related questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/



If you litter your database queries with nolock query hints, are you aware of the side effects?
Try reading a few of these links...

(*) Missing rows with nolock
(*) Allocation order scans with nolock
(*) Consistency issues with nolock
(*) Transient Corruption Errors in SQL Server error log caused by nolock
(*) Dirty reads, read errors, reading rows twice and missing rows with nolock


Craig Wilkinson - Software Engineer
LinkedIn
murthyvs
murthyvs
Valued Member
Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)

Group: General Forum Members
Points: 62 Visits: 86
Thanks much, nigelrivett .. this is a great help! I could understand the code. Really appreciate it.
murthyvs
murthyvs
Valued Member
Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)

Group: General Forum Members
Points: 62 Visits: 86
Thanks Cadavre. This little too complicated for a newbie. But I'll go through the code and understand the beauty of it. Appreciate your quick reply.
Lynn Pettis
Lynn Pettis
SSC-Insane
SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)

Group: General Forum Members
Points: 24245 Visits: 37978
Just some food for thought:



declare @TestString varchar(8000) = '30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|';
select
max(case ds2.ItemNumber when 1 then ds2.Item else '' end) as field1,
max(case ds2.ItemNumber when 2 then ds2.Item else '' end) as field2
from
dbo.DelimitedSplit8K(@TestString,'|') ds1
cross apply dbo.DelimitedSplit8K(ds1.Item,';') ds2
where
ds1.Item <> ''
group by
ds1.ItemNumber;




Uses Jeff's DelimitedSplit8K function as is.

You will find the code for the dbo.DelimitedSplit8K function here: http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Cadavre
Cadavre
SSCrazy
SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)

Group: General Forum Members
Points: 2596 Visits: 8437
Lynn Pettis (12/11/2012)
Just some food for thought:



declare @TestString varchar(8000) = '30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|';
select
max(case ds2.ItemNumber when 1 then ds2.Item else '' end) as field1,
max(case ds2.ItemNumber when 2 then ds2.Item else '' end) as field2
from
dbo.DelimitedSplit8K(@TestString,'|') ds1
cross apply dbo.DelimitedSplit8K(ds1.Item,';') ds2
where
ds1.Item <> ''
group by
ds1.ItemNumber;




Uses Jeff's DelimitedSplit8K function as is.

You will find the code for the dbo.DelimitedSplit8K function here: http://www.sqlservercentral.com/articles/Tally+Table/72993/.


You're right, that's a much better idea. To make it work for the OPs requirements, I'd stick it in a sproc to generate the result-set dynamically so that the number of "field" doesn't need to be known before-hand.

So first, we need Jeff's splitter: -
IF EXISTS (SELECT 1
FROM sysobjects
WHERE id = OBJECT_ID('DelimitedSplit8K'))
BEGIN
DROP FUNCTION [dbo].[DelimitedSplit8K];
END
GO
CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
SELECT s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
;
GO



Then we need the calling sproc: -
IF EXISTS (SELECT 1
FROM sysobjects
WHERE id = OBJECT_ID('SSC_Multi_Split'))
BEGIN
DROP PROCEDURE SSC_Multi_Split;
END
GO
CREATE PROCEDURE SSC_Multi_Split (@string VARCHAR(8000), @rowDeliminater CHAR(1), @colDeliminater CHAR(1))
AS
BEGIN
IF object_id('tempdb..#temporaryResultHolder') IS NOT NULL
BEGIN
DROP TABLE #temporaryResultHolder;
END;

SELECT ds1.ItemNumber AS rowNumber, ds2.ItemNumber AS columnNumber, ds2.Item AS data
INTO #temporaryResultHolder
FROM dbo.DelimitedSplit8K(@string, @rowDeliminater) ds1
CROSS APPLY dbo.DelimitedSplit8K(ds1.Item, @colDeliminater) ds2
WHERE ds1.Item <> '';

DECLARE @SQL NVARCHAR(MAX);

WITH CTE1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
CTE2(N) AS (SELECT 1 FROM CTE1 X CROSS JOIN CTE1 Y),
CTE3(N) AS (SELECT 1 FROM CTE2 X CROSS JOIN CTE2 Y),
CTE4(N) AS (SELECT 1 FROM CTE3 X CROSS JOIN CTE3 Y),
CTE5(N) AS (SELECT 1 FROM CTE4 X CROSS JOIN CTE4 Y),
CTE6(N) AS (SELECT 1 FROM CTE5 X CROSS JOIN CTE5 Y),
TALLY(N) AS (SELECT TOP (ISNULL(DATALENGTH(@string),0))
ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM CTE6)
SELECT @SQL = STUFF((SELECT ','+CHAR(13)+CHAR(10)+'MAX(CASE WHEN columnNumber = '+CAST(N AS VARCHAR(20))+' THEN data ELSE '+CHAR(39)+CHAR(39)+' END) AS field'+CAST(N AS VARCHAR(20))
FROM TALLY
CROSS APPLY (SELECT TOP 1 columnNumber
FROM #temporaryResultHolder
ORDER BY columnNumber DESC) b
WHERE columnNumber >= N
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)'),1,3,'');

SELECT @SQL = 'SELECT '+@SQL+CHAR(13)+CHAR(10)+'FROM #temporaryResultHolder'+CHAR(13)+CHAR(10)+'GROUP BY rowNumber'+CHAR(13)+CHAR(10)+'ORDER BY rowNumber';

EXECUTE sp_executeSQL @SQL;
END
GO



And we execute as follows: -
EXECUTE SSC_Multi_Split '30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|','|',';';




Forever trying to learn

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better, quicker answers on SQL Server performance related questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/



If you litter your database queries with nolock query hints, are you aware of the side effects?
Try reading a few of these links...

(*) Missing rows with nolock
(*) Allocation order scans with nolock
(*) Consistency issues with nolock
(*) Transient Corruption Errors in SQL Server error log caused by nolock
(*) Dirty reads, read errors, reading rows twice and missing rows with nolock


Craig Wilkinson - Software Engineer
LinkedIn
murthyvs
murthyvs
Valued Member
Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)

Group: General Forum Members
Points: 62 Visits: 86
Thanks Cadavre. This is cool stuff.

Can your code take two row terminators? pipe (|) and line break ()?

Example data:
field1;field2|
30;38469|
31;38469|
32;38469|
33;38469|
34;38469|

I didn't notice the line break in the original data. Very sorry.

The intent is that my user copies example data from excel and enters into a multiline text box in a webpage. He then hits a button that would transfer the input string (example data) to a stored proc that reads the data and splits into multiple column - multiple rows.

Thanks for all your help.
Cadavre
Cadavre
SSCrazy
SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)

Group: General Forum Members
Points: 2596 Visits: 8437
murthyvs (12/11/2012)
Thanks Cadavre. This is cool stuff.

Can your code take two row terminators? pipe (|) and line break ()?

Example data:
field1;field2|
30;38469|
31;38469|
32;38469|
33;38469|
34;38469|

I didn't notice the line break in the original data. Very sorry.

The intent is that my user copies example data from excel and enters into a multiline text box in a webpage. He then hits a button that would transfer the input string (example data) to a stored proc that reads the data and splits into multiple column - multiple rows.

Thanks for all your help.


Why not just remove the line breaks?

DECLARE @exampleData VARCHAR(8000) = '30;38469|
31;38469|
32;38469|
33;38469|
34;38469|';

SET @exampleData = REPLACE(REPLACE(@exampleData,CHAR(13),''),CHAR(10),'');

EXECUTE SSC_Multi_Split @exampleData,'|',';'




Forever trying to learn

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better, quicker answers on SQL Server performance related questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/



If you litter your database queries with nolock query hints, are you aware of the side effects?
Try reading a few of these links...

(*) Missing rows with nolock
(*) Allocation order scans with nolock
(*) Consistency issues with nolock
(*) Transient Corruption Errors in SQL Server error log caused by nolock
(*) Dirty reads, read errors, reading rows twice and missing rows with nolock


Craig Wilkinson - Software Engineer
LinkedIn
murthyvs
murthyvs
Valued Member
Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)Valued Member (62 reputation)

Group: General Forum Members
Points: 62 Visits: 86
Awesome .. thanks!!
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search