Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««23456»»

Split input string into multicolumn - multirows Expand / Collapse
Author
Message
Posted Wednesday, December 19, 2012 8:36 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 8:37 PM
Points: 2,370, Visits: 3,252
mister.magoo (12/19/2012)
And the nice thing is this method works nicely on varchar(MAX)...

I used 2000 rows of 2000 pairs of values to test :

Duration_Avg	Cpu_Avg	Reads_Avg	Writes_Avg	RowCounts_Avg
106994.000000 98561 4196918 36806 4000000


Of course, I couldn't compare this to the other methods because they don't handle MAX in their current forms, but I did test them at just 200 rows/200 pairs per row and they were getting very slow.

I think 107 seconds to split out 4 million rows is quite good though


Very interesting and thanks for the Brad Schultz link.

I have a vague recollection of some recent discussion in Jeff's DelimitedSplit8K article that may have also touched on this approach. That discussion thread is pretty huge but it would be in the last 2-3 pages of it I think. Can't recall if it was suggesting to use this for VARCHAR(MAX) or not though. Maybe something else to investigate or maybe even something you'd like to add to the discussion.



No loops! No CURSORs! No RBAR! Hoo-uh!

INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Are you too recursively challenged?
Splitting strings based on patterns can be fast!
Post #1398750
Posted Wednesday, December 19, 2012 9:07 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:54 PM
Points: 33,113, Visits: 27,041
mister.magoo (12/19/2012)
Hi, I had a feeling that I had forgotten something important about the XML method...then I remembered...


N-i-i-i-i-i-c-c-c-c-e-e-e ! ! ! ! !


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1398754
Posted Wednesday, December 19, 2012 10:14 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:54 PM
Points: 33,113, Visits: 27,041
I'm getting some conflicting results on the performance testing. A 10 year old single cpu desktop computer and a laptop probably aren't the best places for me to test on. I'll try to reserve a little time to test these on the server at work today (it's just after midnight here).

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1398772
Posted Wednesday, December 19, 2012 10:17 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 8:37 PM
Points: 2,370, Visits: 3,252
Jeff Moden (12/19/2012)
I'm getting some conflicting results on the performance testing. A 10 year old single cpu desktop computer and a laptop probably aren't the best places for me to test on. I'll try to reserve a little time to test these on the server at work today (it's just after midnight here).


I would be really interested to hear if you've got any ideas why PatternSplitCM is doing reasonably well on the CPU side but sucking so badly on elapsed time.



No loops! No CURSORs! No RBAR! Hoo-uh!

INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Are you too recursively challenged?
Splitting strings based on patterns can be fast!
Post #1398773
Posted Thursday, December 20, 2012 1:34 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Today @ 12:24 AM
Points: 1,337, Visits: 4,025
Jeff Moden (12/19/2012)
I'm getting some conflicting results on the performance testing. A 10 year old single cpu desktop computer and a laptop probably aren't the best places for me to test on. I'll try to reserve a little time to test these on the server at work today (it's just after midnight here).


Having read recently about the problems with using set statistics time on, I used a server side trace to measure when I was testing.

I also saw some strange results, mostly from Lynn's solution which, for some reason would sometimes have terrible elapsed time, but I took that to be environmental as it was seemingly random.


MM




Post #1398845
Posted Thursday, December 20, 2012 7:27 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:54 PM
Points: 33,113, Visits: 27,041
mister.magoo (12/20/2012)
Jeff Moden (12/19/2012)
I'm getting some conflicting results on the performance testing. A 10 year old single cpu desktop computer and a laptop probably aren't the best places for me to test on. I'll try to reserve a little time to test these on the server at work today (it's just after midnight here).


Having read recently about the problems with using set statistics time on, I used a server side trace to measure when I was testing.

I also saw some strange results, mostly from Lynn's solution which, for some reason would sometimes have terrible elapsed time, but I took that to be environmental as it was seemingly random.


The other thing on this testing is that the data is vertically homogenous. All of the delimiters are in precisely the same spot on every row and we've seen where that does give certain splitting methods a seeming advantage.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1398966
Posted Friday, December 21, 2012 6:46 PM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Yesterday @ 9:51 AM
Points: 298, Visits: 1,323
[EDIT FOR CLARIFICATION: When using the DelimitedSplit8K function it makes no sense to use it on strings longer than 8K and it will not perform as well as other methods if the datatype is VARCHAR(MAX). In the code posted below, make sure to change the datatypes to VARCHAR(8000) when the DelimitedSplit8K function is involved. Thanks to Jeff for pointing this out.]

OK...I came up with another variation of the XML splitter function. I also wanted to compare the various methods based on the size of the sample string to parse AND to compare methods when the size of the values in the value pairs was not constant. I wrote a function that created value pairs first as just numerics 0-9 and then one with alphanumerics. Both of these generated rows in which the value pairs were random and different for each row. I've posted the various functions and scripts below along with the time comparisons.

First test used the same data as generated in a post above using a Tally table CTE. I generated test data with 100 rows, 1,000 rows, and 10,000 rows.


IF OBJECT_ID('tempdb..#Strings') IS NOT NULL
DROP TABLE #Strings

CREATE TABLE #Strings (ID INT IDENTITY, MyString VARCHAR(MAX))

;WITH Tally (n) AS (
SELECT TOP 100 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns a CROSS JOIN sys.all_columns b)
INSERT INTO #Strings
SELECT CAST((
SELECT CAST(n AS VARCHAR(5)) + ';' + CAST(40000+n AS VARCHAR(6)) + '|'
FROM Tally
FOR XML PATH('')) AS VARCHAR(MAX))
FROM Tally


The first row looked like this (and all the rows were the same):
MyString
1;40001|2;40002|3;40003| [...] |997;40997|998;40998|999;40999|1000;41000|

Performance:

(100 row(s) affected) ******************************************************
DelimitedSplit8K (Lynn Pettis)
SQL Server Execution Times:
CPU time = 391 ms, elapsed time = 1129 ms.

PatternSplitCM (Dwain.C)
SQL Server Execution Times:
CPU time = 750 ms, elapsed time = 22446 ms.

XML (Mister Magoo)
SQL Server Execution Times:
CPU time = 421 ms, elapsed time = 437 ms.

DelimitedSplit_XML2 (Steven Willis) (Code for the function is below)
SQL Server Execution Times:
CPU time = 32 ms, elapsed time = 23 ms.



(1000 row(s) affected) ******************************************************
DelimitedSplit8K (Lynn Pettis)
SQL Server Execution Times:
(I gave up after 7 minutes)

PatternSplitCM (Dwain.C)
SQL Server Execution Times:
CPU time = 40157 ms, elapsed time = 334170 ms.

XML (Mister Magoo)
SQL Server Execution Times:
CPU time = 39891 ms, elapsed time = 40201 ms.

DelimitedSplit_XML2 (Steven Willis)
SQL Server Execution Times:
CPU time = 1453 ms, elapsed time = 1573 ms.



10000 row(s) affected) ******************************************************
[Other methods all took more than 7 mins which was my "give up" threshold]

DelimitedSplit_XML2 (Steven Willis)
SQL Server Execution Times:
CPU time = 152500 ms, elapsed time = 229167 ms.




The second series of tests used a function to generate random data of different lengths for each item of each value pair and for each row. I generated test data with 100 rows and 1,000 rows. In this test, the results were quite different!


IF OBJECT_ID('tempdb..#Strings') IS NOT NULL
DROP TABLE #Strings

CREATE TABLE #Strings (ID INT IDENTITY, MyString VARCHAR(MAX))

DECLARE @X INT
SET @X = 1

WHILE @X <= 100
BEGIN

INSERT INTO #Strings
SELECT CAST((
SELECT TOP(100)
dbo.svfGenerateUniqueCode
(0,ISNULL(NULLIF(FLOOR(CAST(RAND(t1.N*10000000) AS DECIMAL(18,4))*10),0),10),'')
+ ';' +
dbo.svfGenerateUniqueCode
(0,ISNULL(NULLIF(FLOOR(CAST(RAND((t1.N+100)*10000000) AS DECIMAL(18,4))*10),0),10),'')
+ '|'
FROM
dbo.Tally AS t1
FOR XML PATH('')) AS VARCHAR(MAX))

SET @X = @X+1

END


The test rows looked like this:

MyString (each row a different set of random numbers of variable length which will be different for every run)

ID MyString
1 5061;1552|4237254;407035|27;5|042;617152|6253541;336|3;6|234;646436|1004512557;704|...
2 4535;501356|5561164;47|100;2|0764;6|4041310;051|45;4|47;1621|3;605|2;2477|...
3 270301173;14|60;20751|44273;327607|2444234;663274273|2;404426|4;165211|...

Performance:

(100 row(s) affected) ******************************************************
DelimitedSplit8K (Lynn Pettis)
SQL Server Execution Times:
CPU time = 593 ms, elapsed time = 617 ms.

PatternSplitCM (Dwain.C)
SQL Server Execution Times:
CPU time = 594 ms, elapsed time = 5392 ms.

XML (Mister Magoo)
SQL Server Execution Times:
CPU time = 421 ms, elapsed time = 412 ms.


DelimitedSplit_XML2 (Steven Willis)
SQL Server Execution Times:
CPU time = 625 ms, elapsed time = 640 ms.

(1000 row(s) affected) ******************************************************

DelimitedSplit8K (Lynn Pettis)
SQL Server Execution Times:
CPU time = 4922 ms, elapsed time = 14138 ms.

PatternSplitCM (Dwain.C)
SQL Server Execution Times:
CPU time = 8156 ms, elapsed time = 172964 ms.

XML (Mister Magoo)
SQL Server Execution Times:
CPU time = 4312 ms, elapsed time = 4481 ms.


DelimitedSplit_XML2 (Steven Willis)
SQL Server Execution Times:
CPU time = 6281 ms, elapsed time = 6835 ms.


For random alphanumerics:

MyString (each row a different set of random alphanumerics of variable length which will be different for every run)

ID MyString
1 Uu6xS;e7Gd|cQm;J|DQ;BgU|F4;f|UqEqZx;fWK|E0k1WRan;h|Pz;1fd|...
2 fEsK;mUN20WCD|ehSK;SFNB5|YN5mgLY;QjAVec6|5;fq|R3z;4uUS|...
3 5hnr;6P|Yvqb6Pn;P3r|vW4q;h6G|uuaHXx7Fwq;5|QS;gs7|xgn7K;4cCebC|...


Performance:

(100 row(s) affected) ******************************************************

DelimitedSplit8K (Lynn Pettis)
SQL Server Execution Times:
CPU time = 640 ms, elapsed time = 371 ms.


PatternSplitCM (Dwain.C)
SQL Server Execution Times:
CPU time = 328 ms, elapsed time = 1523 ms.

XML (Mister Magoo)
SQL Server Execution Times:
CPU time = 454 ms, elapsed time = 466 ms.

DelimitedSplit_XML2 (Steven Willis)
SQL Server Execution Times:
CPU time = 578 ms, elapsed time = 585 ms.


(1000 row(s) affected) ******************************************************

DelimitedSplit8K (Lynn Pettis)
SQL Server Execution Times:
CPU time = 5125 ms, elapsed time = 14625 ms.

PatternSplitCM (Dwain.C)
SQL Server Execution Times:
CPU time = 4281 ms, elapsed time = 58210 ms.

XML (Mister Magoo)
SQL Server Execution Times:
CPU time = 4313 ms, elapsed time = 4305 ms.


DelimitedSplit_XML2 (Steven Willis)
SQL Server Execution Times:
CPU time = 5937 ms, elapsed time = 5955 ms.


The test code used to run each test:


DECLARE @Holder1 VARCHAR(MAX), @Holder2 VARCHAR(MAX)

PRINT 'DelimitedSplit8K (Lynn Pettis)'
SET STATISTICS TIME ON
SELECT
@Holder1 = MAX(CASE ds2.ItemNumber
WHEN 1 THEN ds2.Item
ELSE ''
END)
, --as field1,
@Holder2 = MAX(CASE ds2.ItemNumber
WHEN 2 THEN ds2.Item
ELSE ''
END) --as field2
FROM
#Strings
CROSS APPLY dbo.DelimitedSplit8K(MyString,'|') ds1
CROSS APPLY dbo.DelimitedSplit8K(ds1.Item,';') ds2
WHERE
ds1.Item <> ''
GROUP BY
ID
,ds1.ItemNumber ;
SET STATISTICS TIME OFF


PRINT 'PatternSplitCM (Dwain.C)'
SET STATISTICS TIME ON
SELECT
@Holder1 = MAX(CASE ItemNumber % 4
WHEN 1 THEN Item
END)
,@Holder2 = MAX(CASE ItemNumber % 4
WHEN 3 THEN Item
END)
FROM
#Strings
CROSS APPLY PatternSplitCM(MyString,'[0-9]')
WHERE
[Matched] = 1
GROUP BY
ID
,ItemNumber / 4
--OPTION (MAXDOP 1)
SET STATISTICS TIME OFF


PRINT 'XML (Mister Magoo)'
GO
SET STATISTICS TIME ON
GO

DECLARE @Holder1 VARCHAR(MAX), @Holder2 VARCHAR(MAX)

SELECT
@Holder1 = nd.value('./@a','varchar(200)')
,@Holder2 = nd.value('./@b','varchar(200)')
FROM
(
SELECT
CAST('<c a="' + REPLACE(REPLACE(MyString,'|','"/><c a="'),';','" b="') + '"/>' AS XML).query('.')
FROM
#Strings
FOR
XML PATH('')
,TYPE
) AS src (nds)
CROSS APPLY nds.nodes('c') AS x (nd)
WHERE
nd.exist('./@b') = 1

GO

SET STATISTICS TIME OFF
GO



PRINT 'DelimitedSplit_XML2 (Steven Willis)'
GO
SET STATISTICS TIME ON
GO

DECLARE
@Holder3 VARCHAR(MAX)
,@Holder4 VARCHAR(MAX)

SELECT
@Holder3 = Col1
,@Holder4 = Col2
FROM
(
SELECT
ID
,dsk1.Col1
,dsk1.Col2
FROM
#Strings a
CROSS APPLY dbo.tvfDelimitedSplitXML2(MyString,'|',';') AS dsk1
) r1
GO

SET STATISTICS TIME OFF
GO



ANOTHER splitter function:


/* This function will also split 1-dimensional arrays if Delimiter2 is left blank */

CREATE FUNCTION [dbo].[tvfDelimitedSplitXML2]
(
@InputString VARCHAR(MAX)
,@Delimiter1 CHAR(1)
,@Delimiter2 CHAR(1)
)
RETURNS @Elements TABLE
(
ItemNumber INT IDENTITY(1,1) NOT NULL
,Col1 VARCHAR(MAX) NULL
,Col2 VARCHAR(MAX) NULL
,PRIMARY KEY (ItemNumber)
)
WITH SCHEMABINDING
AS
BEGIN

DECLARE
@X VARCHAR(MAX)
,@XML XML
,@NumElements INT
,@Counter INT


IF RIGHT(@InputString,1) = @Delimiter1
SET @InputString = LEFT(@InputString,LEN(@InputString)-1)
IF LEFT(@InputString,1) = @Delimiter1
SET @InputString = RIGHT(@InputString,LEN(@InputString)-1)


IF NULLIF(@Delimiter2,'') IS NULL
BEGIN

-- Parse the 1-dimensional delimited string array
SET @X = '<root><s>' + REPLACE(@InputString,@Delimiter1,'</s><s>')+'</s></root>'

-- Convert the string into XML
SET @XML = CONVERT(XML,@X)

-- Select the rows from the array
INSERT INTO @Elements (Col1)
SELECT
T.c.value('.','VARCHAR(MAX)') AS [Value] FROM @XML.nodes('/root/s') T (c)

END
ELSE
BEGIN

-- Parse the 2-dimensional delimited string array

SET @X = '<s1>' + REPLACE(@InputString,@Delimiter1,'</s2><s1>')+'</s2>'
SET @X = REPLACE(@X,@Delimiter2,'</s1><s2>')
SET @X = '<root>'+@X+'</root>'


-- Convert the string into XML
SET @XML = CONVERT(XML,@X)


-- Insert for each pair in the array

INSERT INTO @Elements
SELECT
Col1
,Col2
FROM
(SELECT
ROW_NUMBER() OVER(ORDER BY Col1) AS RowID1
,Col1
FROM
(SELECT T.c.value('.','VARCHAR(MAX)') AS [Col1] FROM @XML.nodes('/root/s1') T (c)) AS Col1
) AS Col1
INNER JOIN
(SELECT
ROW_NUMBER() OVER(ORDER BY Col2) AS RowID2
,Col2
FROM
(SELECT T.c.value('.','VARCHAR(MAX)') AS [Col2] FROM @XML.nodes('/root/s2') T (c)) AS Col2
) AS Col2
ON Col1.RowID1 = Col2.RowID2

END

RETURN

/*
Usage:

SELECT * FROM dbo.tvfDelimitedSplitXML2(
'30;38469|31;38470|32;38471|33;38472|34;38473|35;38474|36;38475|37;38476|38;38477|'
,'|'
,';'
)
WHERE ItemNumber > 0

SELECT * FROM dbo.tvfDelimitedSplitXML2(
'Uu6xS;e7Gd|cQm;J|DQ;BgU|F4;f|UqEqZx;fWK|E0k1WRan;h|Pz;1fd|wjZchH;meYAb|w1U7uhj;rusCg'
,'|'
,';'
)
WHERE ItemNumber > 0

SELECT * FROM dbo.tvfDelimitedSplitXML2(
'Element01,Element02,Element03,Element04,Element05'
,',','')
WHERE ItemNumber > 0

*/

END



Other code for reference:



--Used to pass 'NEWID()' into the function below

CREATE VIEW [dbo].[vwRandomGUID]
AS
SELECT
NEWID() AS RandomGUID
GO

--Generates random values

CREATE FUNCTION [dbo].[svfGenerateUniqueCode]
(
@CodeMinLength INT
,@CodeMaxLength INT
,@SpecialChar VARCHAR(50)
)
RETURNS VARCHAR(100)
AS
BEGIN

DECLARE @Code VARCHAR(100)

DECLARE @CodeData TABLE
(
CodeChar VARCHAR(1)
)

DECLARE @Num TABLE
(
Digit INT NOT NULL
PRIMARY KEY CLUSTERED
)

IF @CodeMaxLength <= @CodeMinLength
SET @CodeMaxLength = @CodeMinLength + 1


INSERT INTO @Num
(
Digit
)
SELECT
Digit = 0
UNION ALL
SELECT
Digit = 1
UNION ALL
SELECT
Digit = 2
UNION ALL
SELECT
Digit = 3
UNION ALL
SELECT
Digit = 4
UNION ALL
SELECT
Digit = 5
UNION ALL
SELECT
Digit = 6
UNION ALL
SELECT
Digit = 7
UNION ALL
SELECT
Digit = 8
UNION ALL
SELECT
Digit = 9
ORDER BY
1

INSERT INTO @CodeData
(
CodeChar
)
SELECT
CodeChar = SUBSTRING(b.Characters,a.RAND_INT % b.MOD,1)
FROM
(
SELECT
aa.Number
,RAND_INT = ABS(CONVERT(INT,CONVERT(VARBINARY(100),(SELECT RandomGUID FROM dbo.vwRandomGUID))))
FROM
(
SELECT Number = a.Digit + (b.Digit * 10) FROM @Num a CROSS JOIN @Num b
) aa
) a
CROSS JOIN
(
SELECT
MOD = LEN(bb.Characters) - 1
,bb.Characters
FROM
(
SELECT
Characters =
'ABCDEFGHJKLMNPQURSUVWXYZ'
+ 'abcdefghjkmnpqursuvwxyz'
+ '0123456789'
+ @SpecialChar
) bb
) b
ORDER BY
(SELECT RandomGUID FROM dbo.vwRandomGUID)

SELECT
@Code = ''

SELECT
@Code = @Code + CodeChar
FROM
@CodeData

SELECT
@Code =
-- Random length from MIN to MAX Characters
SUBSTRING(@Code,1,@CodeMinLength + (ABS(CONVERT(INT,CONVERT(VARBINARY(100),(SELECT RandomGUID FROM dbo.vwRandomGUID))))) % (@CodeMaxLength - @CodeMinLength + 1))

SET @Code = NULLIF(LTRIM(RTRIM(@Code)),'')

RETURN @Code

END
GO


Post #1399639
Posted Friday, December 21, 2012 8:42 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:54 PM
Points: 33,113, Visits: 27,041
Steven Willis (12/21/2012)
First test used the same data as generated in a post above using a Tally table CTE. I generated test data with 100 rows, 1,000 rows, and 10,000 rows.



There is a major fault with this testing. It's a well advertised fact that the DelimitedSplit8K method doesn't work well with blobs. In fact, any method that internally joins to a blob at the character level is guaranteed to run much slower than other methods such a XML. That fact is even stated in the "Tally OH!" ariticle.

Methods like XML, however, are (with the understanding that I've not had the time to do the testing the code on this thread deservers) typically much slower than the DelimitedSplit8K method on datatypes of VARCHAR(8000) or less because of the required concatenations.



--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1399644
Posted Friday, December 21, 2012 8:52 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:54 PM
Points: 33,113, Visits: 27,041
To wit, and it's been proven time and again, if you can use CLR on your server, a CLR splitter is going to be the best way to go on this. It will handle both VARCHAR and NVARCHAR without having to make any special adjustments, runs just fine against blob datatypes, and it runs a bit more than twice as fast as DelimitedSplit8K even when it's used properly.

--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1399645
Posted Friday, December 21, 2012 8:55 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 11:54 PM
Points: 33,113, Visits: 27,041
murthyvs (12/11/2012)
Hello all - I am having hard time to split an input string into multicolumn - multirows.

Task - Create a stored procedure that reads an input string with pre-defined field and row terminators; splits the string into multicolumn - multirows; and inserts records into a table.


So, a couple of questions remain to solve this for you properly...

1. Can you use CLR on your servers or not?
2. If not, what is the guranteed absolute maximum length of your delimited parameters?


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1399646
« Prev Topic | Next Topic »

Add to briefcase «««23456»»

Permissions Expand / Collapse