Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12345»»»

Split input string into multicolumn - multirows Expand / Collapse
Author
Message
Posted Tuesday, December 11, 2012 12:53 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 4:26 PM
Points: 23, Visits: 56
Hello all - I am having hard time to split an input string into multicolumn - multirows.

Task - Create a stored procedure that reads an input string with pre-defined field and row terminators; splits the string into multicolumn - multirows; and inserts records into a table.

The table structure is predefined:

create table #temp (field1 varchar(200), field2 varchar(200));

Here is an example input string: (the input string is of variable length; the fields are variable lengths as well)
30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|

";" = field terminator
"|" = row terminator

The stored proc should insert the values in this fashion:
field1 field2
30 38469
31 38470
32 38471
33 38472
34 38473
35 38474
36 38475
37 38476
38 38477

Storing the input string as a text file is not an option. Life would have been much better if I could save the text file and use bulk insert!!

Any help is greatly appreciated. Thanks!!
Post #1394921
Posted Tuesday, December 11, 2012 1:54 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Friday, December 13, 2013 4:35 AM
Points: 386, Visits: 211
This will do something like that.
It caters for any number of fields - just amend the final select for the number required

declare @s varchar(8000)
select @s = '30;38469;1|31;38470;1|32;38471;1|33;38472;1|34;38473;1|35;38474;1|36;38475;1|37;38476;1|38;38477;1|'
select @s = '30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|'

declare @rowterm varchar(1) = '|'
declare @fieldterm varchar(1) = ';'

;with cte as
(
select rowstrt = 1, rowend = charindex(@rowterm,@s)-1, seq = 1
union all
select rowstrt = rowend+2, rowend = charindex(@rowterm,@s,rowend+2)-1, seq = seq + 1
from cte where charindex(@rowterm,@s,rowend+2)<>0
) ,
cte2 as
(
select s = substring(@s, rowstrt,rowend-rowstrt+1), seq
from cte
) ,
cte3 as
(
select fldstrt = 1, fldend = charindex(@fieldterm,s)-1, seq, s, fldseq = 1 from cte2
union all
select fldstrt = fldend+2,
fldend = case when charindex(@fieldterm,s,fldend+2) <> 0 then charindex(@fieldterm,s,fldend+2)-1 else len(s) end,
seq, s, fldseq = fldseq+1
from cte3 where fldend < len(s)
)
select s1.s, s2.s, s3.s
from
(select seq, s = substring(s,fldstrt,fldend-fldstrt+1) from cte3 where fldseq = 1) s1
join (select seq, s = substring(s,fldstrt,fldend-fldstrt+1) from cte3 where fldseq = 2) s2 on s1.seq = s2.seq
left join (select seq, s = substring(s,fldstrt,fldend-fldstrt+1) from cte3 where fldseq = 3) s3 on s1.seq = s3.seq



Cursors never.
DTS - only when needed and never to control.
Post #1394940
Posted Tuesday, December 11, 2012 2:32 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: 2 days ago @ 1:30 AM
Points: 2,386, Visits: 7,622
My hope is that Jeff Moden won't take um-bridge to my ripping apart of his code. . .

I've based this answer on the 8K splitter, with some modifications to make it split on multiple deliminators.

IF EXISTS (SELECT 1
FROM sysobjects
WHERE id = OBJECT_ID('SSC_Multi_Split') AND OBJECTPROPERTY(id, 'IsProcedure') = 1)
BEGIN
DROP PROCEDURE SSC_Multi_Split;
END
GO
CREATE PROCEDURE SSC_Multi_Split (@string VARCHAR(8000), @rowDeliminater CHAR(1), @colDeliminater CHAR(1))
AS
BEGIN
IF object_id('tempdb..#temporaryResultHolder') IS NOT NULL
BEGIN
DROP TABLE #temporaryResultHolder;
END;

WITH CTE1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
CTE2(N) AS (SELECT 1 FROM CTE1 X CROSS JOIN CTE1 Y),
CTE3(N) AS (SELECT 1 FROM CTE2 X CROSS JOIN CTE2 Y),
CTE4(N) AS (SELECT 1 FROM CTE3 X CROSS JOIN CTE3 Y),
CTE5(N) AS (SELECT 1 FROM CTE4 X CROSS JOIN CTE4 Y),
CTE6(N) AS (SELECT 1 FROM CTE5 X CROSS JOIN CTE5 Y),
TALLY(N) AS (SELECT TOP (ISNULL(DATALENGTH(@string),0))
ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM CTE6),
RowStart(start) AS (SELECT 1 UNION ALL
SELECT N+1
FROM TALLY
WHERE SUBSTRING(@string,N,1) = @rowDeliminater
),
RowLen(start,size) AS (SELECT start,
ISNULL(NULLIF(CHARINDEX(@rowDeliminater,@string,start),0)-start,ISNULL(DATALENGTH(@string),8000))
FROM RowStart
),
RowData(rowNumber,data) AS (SELECT ROW_NUMBER() OVER(ORDER BY start),
SUBSTRING(@string, start, size)
FROM RowLen
),
ColumnStart(start, rowNumber) AS (SELECT b.N, b.rowNumber
FROM (SELECT N+1, rowNumber
FROM TALLY
CROSS APPLY (SELECT rowNumber,data
FROM RowData) b
WHERE SUBSTRING(data,N,1) = @colDeliminater
)a(N, rowNumber)
CROSS APPLY (SELECT 1, rowNumber UNION ALL SELECT N, rowNumber)b(N, rowNumber)
),
ColumnLen(rowNumber,start,size,data) AS (SELECT b.rowNumber, start,
ISNULL(NULLIF(CHARINDEX(@colDeliminater,data,start),0)-start,ISNULL(DATALENGTH(data),8000)),
data
FROM ColumnStart a
CROSS APPLY (SELECT rowNumber,data
FROM RowData
WHERE a.rowNumber = rowNumber) b
),
allData(rowNumber,columnNumber,data) AS (SELECT rowNumber, ROW_NUMBER() OVER(PARTITION BY rowNumber ORDER BY start),
SUBSTRING(data, start, size)
FROM ColumnLen
)
SELECT rowNumber,columnNumber,data
INTO #temporaryResultHolder
FROM allData;

DECLARE @SQL NVARCHAR(MAX);

WITH CTE1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
CTE2(N) AS (SELECT 1 FROM CTE1 X CROSS JOIN CTE1 Y),
CTE3(N) AS (SELECT 1 FROM CTE2 X CROSS JOIN CTE2 Y),
CTE4(N) AS (SELECT 1 FROM CTE3 X CROSS JOIN CTE3 Y),
CTE5(N) AS (SELECT 1 FROM CTE4 X CROSS JOIN CTE4 Y),
CTE6(N) AS (SELECT 1 FROM CTE5 X CROSS JOIN CTE5 Y),
TALLY(N) AS (SELECT TOP (ISNULL(DATALENGTH(@string),0))
ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM CTE6)
SELECT @SQL = STUFF((SELECT ','+CHAR(13)+CHAR(10)+'MAX(CASE WHEN columnNumber = '+CAST(N AS VARCHAR(20))+' THEN data ELSE '+CHAR(39)+CHAR(39)+' END) AS field'+CAST(N AS VARCHAR(20))
FROM TALLY
CROSS APPLY (SELECT TOP 1 columnNumber
FROM #temporaryResultHolder
ORDER BY columnNumber DESC) b
WHERE columnNumber >= N
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)'),1,3,'');

SELECT @SQL = 'SELECT '+@SQL+CHAR(13)+CHAR(10)+'FROM #temporaryResultHolder'+CHAR(13)+CHAR(10)+'GROUP BY rowNumber'+CHAR(13)+CHAR(10)+'ORDER BY rowNumber';

EXECUTE sp_executeSQL @SQL;
END
GO

To test: -
EXECUTE SSC_Multi_Split '30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|','|',';';

Which returns: -
field1  field2
------ -------
30 38469
31 38470
32 38471
33 38472
34 38473
35 38474
36 38475
37 38476
38 38477



Not a DBA, just trying to learn

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better, quicker answers on SQL Server performance related questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/



If you litter your database queries with nolock query hints, are you aware of the side effects?
Try reading a few of these links...

(*) Missing rows with nolock
(*) Allocation order scans with nolock
(*) Consistency issues with nolock
(*) Transient Corruption Errors in SQL Server error log caused by nolock
(*) Dirty reads, read errors, reading rows twice and missing rows with nolock


LinkedIn | Blog coming soon (for sufficiently large values of "soon" )!
Post #1394960
Posted Tuesday, December 11, 2012 2:46 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 4:26 PM
Points: 23, Visits: 56
Thanks much, nigelrivett .. this is a great help! I could understand the code. Really appreciate it.
Post #1394968
Posted Tuesday, December 11, 2012 2:47 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 4:26 PM
Points: 23, Visits: 56
Thanks Cadavre. This little too complicated for a newbie. But I'll go through the code and understand the beauty of it. Appreciate your quick reply.
Post #1394970
Posted Tuesday, December 11, 2012 4:09 AM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Yesterday @ 10:48 AM
Points: 20,857, Visits: 32,877
Just some food for thought:


declare @TestString varchar(8000) = '30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|';
select
max(case ds2.ItemNumber when 1 then ds2.Item else '' end) as field1,
max(case ds2.ItemNumber when 2 then ds2.Item else '' end) as field2
from
dbo.DelimitedSplit8K(@TestString,'|') ds1
cross apply dbo.DelimitedSplit8K(ds1.Item,';') ds2
where
ds1.Item <> ''
group by
ds1.ItemNumber;


Uses Jeff's DelimitedSplit8K function as is.

You will find the code for the dbo.DelimitedSplit8K function here: http://www.sqlservercentral.com/articles/Tally+Table/72993/.



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1395021
Posted Tuesday, December 11, 2012 8:28 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: 2 days ago @ 1:30 AM
Points: 2,386, Visits: 7,622
Lynn Pettis (12/11/2012)
Just some food for thought:


declare @TestString varchar(8000) = '30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|';
select
max(case ds2.ItemNumber when 1 then ds2.Item else '' end) as field1,
max(case ds2.ItemNumber when 2 then ds2.Item else '' end) as field2
from
dbo.DelimitedSplit8K(@TestString,'|') ds1
cross apply dbo.DelimitedSplit8K(ds1.Item,';') ds2
where
ds1.Item <> ''
group by
ds1.ItemNumber;


Uses Jeff's DelimitedSplit8K function as is.

You will find the code for the dbo.DelimitedSplit8K function here: http://www.sqlservercentral.com/articles/Tally+Table/72993/.


You're right, that's a much better idea. To make it work for the OPs requirements, I'd stick it in a sproc to generate the result-set dynamically so that the number of "field" doesn't need to be known before-hand.

So first, we need Jeff's splitter: -
IF EXISTS (SELECT 1
FROM sysobjects
WHERE id = OBJECT_ID('DelimitedSplit8K'))
BEGIN
DROP FUNCTION [dbo].[DelimitedSplit8K];
END
GO
CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
SELECT s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
;
GO

Then we need the calling sproc: -
IF EXISTS (SELECT 1
FROM sysobjects
WHERE id = OBJECT_ID('SSC_Multi_Split'))
BEGIN
DROP PROCEDURE SSC_Multi_Split;
END
GO
CREATE PROCEDURE SSC_Multi_Split (@string VARCHAR(8000), @rowDeliminater CHAR(1), @colDeliminater CHAR(1))
AS
BEGIN
IF object_id('tempdb..#temporaryResultHolder') IS NOT NULL
BEGIN
DROP TABLE #temporaryResultHolder;
END;

SELECT ds1.ItemNumber AS rowNumber, ds2.ItemNumber AS columnNumber, ds2.Item AS data
INTO #temporaryResultHolder
FROM dbo.DelimitedSplit8K(@string, @rowDeliminater) ds1
CROSS APPLY dbo.DelimitedSplit8K(ds1.Item, @colDeliminater) ds2
WHERE ds1.Item <> '';

DECLARE @SQL NVARCHAR(MAX);

WITH CTE1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),
CTE2(N) AS (SELECT 1 FROM CTE1 X CROSS JOIN CTE1 Y),
CTE3(N) AS (SELECT 1 FROM CTE2 X CROSS JOIN CTE2 Y),
CTE4(N) AS (SELECT 1 FROM CTE3 X CROSS JOIN CTE3 Y),
CTE5(N) AS (SELECT 1 FROM CTE4 X CROSS JOIN CTE4 Y),
CTE6(N) AS (SELECT 1 FROM CTE5 X CROSS JOIN CTE5 Y),
TALLY(N) AS (SELECT TOP (ISNULL(DATALENGTH(@string),0))
ROW_NUMBER() OVER(ORDER BY (SELECT NULL))
FROM CTE6)
SELECT @SQL = STUFF((SELECT ','+CHAR(13)+CHAR(10)+'MAX(CASE WHEN columnNumber = '+CAST(N AS VARCHAR(20))+' THEN data ELSE '+CHAR(39)+CHAR(39)+' END) AS field'+CAST(N AS VARCHAR(20))
FROM TALLY
CROSS APPLY (SELECT TOP 1 columnNumber
FROM #temporaryResultHolder
ORDER BY columnNumber DESC) b
WHERE columnNumber >= N
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)'),1,3,'');

SELECT @SQL = 'SELECT '+@SQL+CHAR(13)+CHAR(10)+'FROM #temporaryResultHolder'+CHAR(13)+CHAR(10)+'GROUP BY rowNumber'+CHAR(13)+CHAR(10)+'ORDER BY rowNumber';

EXECUTE sp_executeSQL @SQL;
END
GO

And we execute as follows: -
EXECUTE SSC_Multi_Split '30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|','|',';';




Not a DBA, just trying to learn

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better, quicker answers on SQL Server performance related questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/



If you litter your database queries with nolock query hints, are you aware of the side effects?
Try reading a few of these links...

(*) Missing rows with nolock
(*) Allocation order scans with nolock
(*) Consistency issues with nolock
(*) Transient Corruption Errors in SQL Server error log caused by nolock
(*) Dirty reads, read errors, reading rows twice and missing rows with nolock


LinkedIn | Blog coming soon (for sufficiently large values of "soon" )!
Post #1395154
Posted Tuesday, December 11, 2012 8:59 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 4:26 PM
Points: 23, Visits: 56
Thanks Cadavre. This is cool stuff.

Can your code take two row terminators? pipe (|) and line break ()?

Example data:
field1;field2|
30;38469|
31;38469|
32;38469|
33;38469|
34;38469|

I didn't notice the line break in the original data. Very sorry.

The intent is that my user copies example data from excel and enters into a multiline text box in a webpage. He then hits a button that would transfer the input string (example data) to a stored proc that reads the data and splits into multiple column - multiple rows.

Thanks for all your help.
Post #1395175
Posted Tuesday, December 11, 2012 9:20 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: 2 days ago @ 1:30 AM
Points: 2,386, Visits: 7,622
murthyvs (12/11/2012)
Thanks Cadavre. This is cool stuff.

Can your code take two row terminators? pipe (|) and line break ()?

Example data:
field1;field2|
30;38469|
31;38469|
32;38469|
33;38469|
34;38469|

I didn't notice the line break in the original data. Very sorry.

The intent is that my user copies example data from excel and enters into a multiline text box in a webpage. He then hits a button that would transfer the input string (example data) to a stored proc that reads the data and splits into multiple column - multiple rows.

Thanks for all your help.


Why not just remove the line breaks?

DECLARE @exampleData VARCHAR(8000) = '30;38469|
31;38469|
32;38469|
33;38469|
34;38469|';

SET @exampleData = REPLACE(REPLACE(@exampleData,CHAR(13),''),CHAR(10),'');

EXECUTE SSC_Multi_Split @exampleData,'|',';'




Not a DBA, just trying to learn

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better, quicker answers on SQL Server performance related questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/



If you litter your database queries with nolock query hints, are you aware of the side effects?
Try reading a few of these links...

(*) Missing rows with nolock
(*) Allocation order scans with nolock
(*) Consistency issues with nolock
(*) Transient Corruption Errors in SQL Server error log caused by nolock
(*) Dirty reads, read errors, reading rows twice and missing rows with nolock


LinkedIn | Blog coming soon (for sufficiently large values of "soon" )!
Post #1395193
Posted Tuesday, December 11, 2012 11:47 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Yesterday @ 4:26 PM
Points: 23, Visits: 56
Awesome .. thanks!!
Post #1395267
« Prev Topic | Next Topic »

Add to briefcase 12345»»»

Permissions Expand / Collapse