SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


split column pipe delimited


split column pipe delimited

Author
Message
DBayliss
DBayliss
SSC Journeyman
SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)

Group: General Forum Members
Points: 94 Visits: 157
Hi All,

I'm having trouble separating a single column into multiple columns where the pipe is the delimiter.

I have enclosed an excel spreadsheet of what I am trying to achieve.

all feedback on the best way to approach this would be welcome.

Thanks


Example

DETAILS_ID DETAILS
EA22749B180C41D09B4CC986D21C8F02 2006-11-28|1900-01-01|1||PL625245|

to

DETAILS_ID Column 1 Column 2 Column 3 Column 4
EA22749B180C41D09B4CC986D21C8F02 28/11/2006 01/01/1900 1 PL625245
Attachments
splitter.xlsx (36 views, 17.00 KB)
Lowell
Lowell
SSC Guru
SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)

Group: General Forum Members
Points: 68214 Visits: 40898
ok here's one solution:

ALTER FUNCTION dbo.fn_parsename
(
@pString VARCHAR(7999),
@pDelimiter CHAR(1),
@Occurrance int
)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @Results VARCHAR(8000)
--===== "Inline" CTE Driven "Tally Table” produces values up to
-- 10,000... enough to cover VARCHAR(8000)
;WITH
E1(N) AS ( --=== Create Ten 1's
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 --10
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4) ,
--===== Do the split
InterResults
AS
(
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@pString, N, CHARINDEX(@pDelimiter, @pString + @pDelimiter, N) - N) AS Item
FROM cteTally
WHERE N < LEN(@pString) + 2
AND SUBSTRING(@pDelimiter + @pString, N, 1) = @pDelimiter
)
SELECT @Results = Item FROM InterResults WHERE ItemNumber = @Occurrance
return @Results
END --FUNCTION
GO


CREATE TABLE #SampleData(bigstring varchar(500) )
INSERT INTO #SampleData
SELECT 'EA22749B180C41D09B4CC986D21C8F02 2006-11-28|1900-01-01|1||PL625245|'

SELECT
dbo.fn_parsename(bigstring,'|',1) AS [DETAILS_ID],
dbo.fn_parsename(bigstring,'|',2) AS [Column 1],
dbo.fn_parsename(bigstring,'|',3) AS [Column 2],
dbo.fn_parsename(bigstring,'|',4) AS [Column 3],
dbo.fn_parsename(bigstring,'|',5) AS [Column 4]
FROM #SampleData



Lowell
--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!
dwain.c
dwain.c
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16937 Visits: 6431
Instead of creating a special function for the purpose, I just went with Jeff Moden's community DelimitedSplit8K (http://www.sqlservercentral.com/articles/Tally+Table/72993/) to get this:


DECLARE @T TABLE
(DETAILS_ID VARCHAR(100), DETAILS VARCHAR(100))

INSERT INTO @T
SELECT 'EA22749B180C41D09B4CC986D21C8F02','2006-11-28|1900-01-01|1||PL625245|'

;WITH MySplit AS (
SELECT DETAILS_ID, DETAILS, ItemNumber
,Item=CASE WHEN ItemNumber <= 2 THEN CONVERT(VARCHAR(10), CAST(Item AS DATE), 101) ELSE Item END
FROM @T
CROSS APPLY DelimitedSplit8K(DETAILS, '|'))
SELECT DETAILS_ID
,Col1=MAX(CASE WHEN ItemNumber = 1 THEN Item END)
,Col2=MAX(CASE WHEN ItemNumber = 2 THEN Item END)
,Col3=MAX(CASE WHEN ItemNumber = 3 THEN Item END)
,Col4=MAX(CASE WHEN ItemNumber = 5 THEN Item END)
FROM MySplit a
GROUP BY DETAILS_ID




Edit: Initially I didn't notice you wanted the data in 4 columns so I corrected to a crosstab query.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
DBayliss
DBayliss
SSC Journeyman
SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)SSC Journeyman (94 reputation)

Group: General Forum Members
Points: 94 Visits: 157
thanks for your help

much appreciated.
Lowell
Lowell
SSC Guru
SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)

Group: General Forum Members
Points: 68214 Visits: 40898
ok Dwain, I love these performance tests.

here's a setup comparing my crappy scalar function to your solution.

the DelimitedSplit8K pivot was roughly 4x faster on a million rows: 8 seconds vs 2 seconds, roughly.


--------------------------------------------------------------------------------
CLEAR CACHE SO NEITHER QUERY GETS AN ADVANTAGE
--------------------------------------------------------------------------------
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
========== BASELINE ==========

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 180 ms.
================================================================================
--------------------------------------------------------------------------------
CLEAR CACHE SO NEITHER QUERY GETS AN ADVANTAGE
--------------------------------------------------------------------------------
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
========== Scalar Function ==========

SQL Server Execution Times:
CPU time = 7971 ms, elapsed time = 8242 ms.
================================================================================
--------------------------------------------------------------------------------
CLEAR CACHE SO NEITHER QUERY GETS AN ADVANTAGE
--------------------------------------------------------------------------------
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
========== DelimitedSplit8K Featuring Pivot ==========
Warning: Null value is eliminated by an aggregate or other SET operation.

SQL Server Execution Times:
CPU time = 2200 ms, elapsed time = 2472 ms.




and here is my full test harness, but assuming you already have DelimitedSplit8K and my function from above already installed:

--===== Suppress the auto-display of rowcounts to keep them from being
--===== mistaken as part of the result set.
SET NOCOUNT ON
SET STATISTICS TIME OFF;
--=====Drop temp table if it exists for script repeatability
IF (SELECT OBJECT_ID('Tempdb.dbo.#SampleData')) IS NOT NULL
DROP TABLE #SampleData
--=====Generate a million rows of test data matching our string pattern.
SELECT TOP 1000000
ROW_NUMBER() over (order by T1.object_id) As DETAILS_ID,
REPLACE(CONVERT(varchar(40),newid()) ,'-','')
+ '|'
+ CONVERT(varchar(10),T1.create_date,120)
+ '|'
+ CONVERT(varchar(10),T1.modify_date,120)
+ '|'
+ CONVERT(varchar,T1.object_id)
+ '|'
+ 'PL'
+ CONVERT(VARCHAR,ABS(CHECKSUM(NEWID()))%50000+1)
AS DETAILS
INTO #SampleData
FROM sys.objects T1
CROSS JOIN sys.objects T2

DECLARE @HOLDER VARCHAR(20);

PRINT REPLICATE('-',80);
PRINT 'CLEAR CACHE SO NEITHER QUERY GETS AN ADVANTAGE';
PRINT REPLICATE('-',80);
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;

PRINT '========== BASELINE ==========';
SET STATISTICS TIME ON;
SELECT @HOLDER = COUNT(*)
FROM #SampleData;
SET STATISTICS TIME OFF;
PRINT REPLICATE('=',80);

PRINT REPLICATE('-',80);
PRINT 'CLEAR CACHE SO NEITHER QUERY GETS AN ADVANTAGE';
PRINT REPLICATE('-',80);
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;

PRINT '========== Scalar Function ==========';
SET STATISTICS TIME ON;
SELECT
dbo.fn_parsename(DETAILS,'|',1) AS [DETAILS_ID],
dbo.fn_parsename(DETAILS,'|',2) AS [Column 1],
dbo.fn_parsename(DETAILS,'|',3) AS [Column 2],
dbo.fn_parsename(DETAILS,'|',4) AS [Column 3],
dbo.fn_parsename(DETAILS,'|',5) AS [Column 4]
FROM #SampleData;
SET STATISTICS TIME OFF;

PRINT REPLICATE('=',80);
PRINT REPLICATE('-',80);
PRINT 'CLEAR CACHE SO NEITHER QUERY GETS AN ADVANTAGE';
PRINT REPLICATE('-',80);
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;


PRINT '========== DelimitedSplit8K Featuring Pivot ==========';
SET STATISTICS TIME ON;
;WITH MySplit AS (
SELECT DETAILS_ID, DETAILS, ItemNumber
,Item = CASE WHEN ItemNumber <= 3 AND ItemNumber > 1 THEN CONVERT(VARCHAR(10), CAST(Item AS DATE), 101) ELSE Item END
FROM #SampleData
CROSS APPLY DelimitedSplit8K(DETAILS, '|'))
SELECT DETAILS_ID
,Col1=MAX(CASE WHEN ItemNumber = 1 THEN Item END)
,Col2=MAX(CASE WHEN ItemNumber = 2 THEN Item END)
,Col3=MAX(CASE WHEN ItemNumber = 3 THEN Item END)
,Col4=MAX(CASE WHEN ItemNumber = 5 THEN Item END)
FROM MySplit a
GROUP BY DETAILS_ID

SET STATISTICS TIME OFF;



Lowell
--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!
Lowell
Lowell
SSC Guru
SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)SSC Guru (68K reputation)

Group: General Forum Members
Points: 68214 Visits: 40898
ok, adding WITH SCHEMABINDING to my scalar function cuts half a second off of my performance, so it goes to 7.5 seconds or so instead of over 8 seconds.

DelimitedSplit8K already has the WITH SCHEMABINDING directive.

Lowell
--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!
Cadavre
Cadavre
SSCrazy Eights
SSCrazy Eights (8.6K reputation)SSCrazy Eights (8.6K reputation)SSCrazy Eights (8.6K reputation)SSCrazy Eights (8.6K reputation)SSCrazy Eights (8.6K reputation)SSCrazy Eights (8.6K reputation)SSCrazy Eights (8.6K reputation)SSCrazy Eights (8.6K reputation)

Group: General Forum Members
Points: 8598 Visits: 8490
Lowell (10/16/2012)
ok, adding WITH SCHEMABINDING to my scalar function cuts half a second off of my performance, so it goes to 7.5 seconds or so instead of over 8 seconds.

DelimitedSplit8K already has the WITH SCHEMABINDING directive.


The two of you are not producing the same result set in your test (not tested on the sample data provided by the OP).

========== Scalar Function ==========
DETAILS_ID Column 1 Column 2 Column 3 Column 4
---------------------------------- ------------ ------------ ---------- ----------
D221ACC59D214D9AA7480BDD0186B4F5 2012-04-06 2012-04-06 3 PL32145

========== DelimitedSplit8K Featuring Pivot ==========
DETAILS_ID Col1 Col2 Col3 Col4
----------- ---------------------------------- ------------ --------------- ----------
1 D221ACC59D214D9AA7480BDD0186B4F5 04/06/2012 04/06/2012 PL32145



Forever trying to learn

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better, quicker answers on SQL Server performance related questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/



If you litter your database queries with nolock query hints, are you aware of the side effects?
Try reading a few of these links...

(*) Missing rows with nolock
(*) Allocation order scans with nolock
(*) Consistency issues with nolock
(*) Transient Corruption Errors in SQL Server error log caused by nolock
(*) Dirty reads, read errors, reading rows twice and missing rows with nolock


Craig Wilkinson - Software Engineer
LinkedIn
Tom Thomson
Tom Thomson
One Orange Chip
One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)One Orange Chip (25K reputation)

Group: General Forum Members
Points: 25234 Visits: 12488
Cadavre (10/16/2012)
Lowell (10/16/2012)
ok, adding WITH SCHEMABINDING to my scalar function cuts half a second off of my performance, so it goes to 7.5 seconds or so instead of over 8 seconds.

DelimitedSplit8K already has the WITH SCHEMABINDING directive.


The two of you are not producing the same result set in your test (not tested on the sample data provided by the OP).

That's odd, I get different results from yours with their code. Judging by edit times and posting times, the versions I'm using were in their posts a long time before you posted your results. The results I get are different too: Lowell's code gives dates in form yyyy-mm-dd while dwaion.c's uses mm/dd/yyyy (and the OP's expected result used dd/mm/yyyy) but that's trivia which the OP can easily sort out.
Another difference is that Dwain changed the double pipe to a single pipe so that there would only be 4 split elements while Lowell left it in and ignored the resulting extra split element (4th split element of 5), so they don't both run off quite the same data.

Tom

dwain.c
dwain.c
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16937 Visits: 6431
Lowell (10/16/2012)
ok, adding WITH SCHEMABINDING to my scalar function cuts half a second off of my performance, so it goes to 7.5 seconds or so instead of over 8 seconds.

DelimitedSplit8K already has the WITH SCHEMABINDING directive.


Hey Lowell!

I really wish I had time today to look into this in detail but I fear I will not.

I am wondering if you took into account "measurement" error for your scalar function? Jeff M. covered that topic in his article here: http://www.sqlservercentral.com/articles/T-SQL/91724/. If you didn't, it might work in your favor.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
dwain.c
dwain.c
SSCoach
SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)SSCoach (16K reputation)

Group: General Forum Members
Points: 16937 Visits: 6431
Cadavre (10/16/2012)
Lowell (10/16/2012)
ok, adding WITH SCHEMABINDING to my scalar function cuts half a second off of my performance, so it goes to 7.5 seconds or so instead of over 8 seconds.

DelimitedSplit8K already has the WITH SCHEMABINDING directive.


The two of you are not producing the same result set in your test (not tested on the sample data provided by the OP).

========== Scalar Function ==========
DETAILS_ID Column 1 Column 2 Column 3 Column 4
---------------------------------- ------------ ------------ ---------- ----------
D221ACC59D214D9AA7480BDD0186B4F5 2012-04-06 2012-04-06 3 PL32145

========== DelimitedSplit8K Featuring Pivot ==========
DETAILS_ID Col1 Col2 Col3 Col4
----------- ---------------------------------- ------------ --------------- ----------
1 D221ACC59D214D9AA7480BDD0186B4F5 04/06/2012 04/06/2012 PL32145


Cadavre - Can you provide the sample data you got this result with?

When I run mine on the OP's sample data I get this:


DETAILS_ID Col1 Col2 Col3 Col4
EA22749B180C41D09B4CC986D21C8F02 11/28/2006 01/01/1900 1 PL625245


BTW. Changing the CONVERT format code to 103 should display the dates in dd/mm/yyyy as the OP needed (my mistake).


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search