Tally OH! An Improved SQL 8K “CSV Splitter” Function

  • Jeff Moden (3/15/2012)


    peter-757102 (3/15/2012)


    Jeff, did you receive any news regarding the performance characteristics of the code (or variations) on SQL Server 2012?

    And on the extra bright side, in a few months I will be working in a .NET driven development environment, then it is time for me to do some experiments with use of that in SQL Server :).

    Haven't tried it there yet but I don't see why any of this would change.

    So far as the .Net driven development environment goes, take a look at the CLR splitter that Paul White wrote for the comparison testing in this article.

    It certainly could be quite different if one of the following internals has been modified in SQL Server 2012 versus 2008:

    * string handling

    * memory management

    * query optimizer changes (CTE/operator stuff)

    Also, the performance difference between varchar(max) and the regular varchar types might be gone, as there is no good technical reason for a large difference there. As for .Net coding, I will certainly look at Paul White's splitter, to begin with as a good source to learn from. I haven't done much .Net, nor recently.

    By the way, did I miss your long string splitter version somehow, or was that work never finished?

  • Actually, there is a good reason for the MAX datatypes to have a performance problem with such splitters. They're simply handled differently, even when they're "in row", and because of that, they don't cotton to being joined to.

    I've not gotten around to writing an article for the "long splitter" so, no, you didn't miss anything.

    And I suspect that no matter what happens to the internals, the cteTally method in the article will continue to be faster than using the other types of splitters simply because all the methods would be affected by the things you suggest could change. Of course, I could be wrong, as well. 😉
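
    If the strings actually fit in 8K, one common workaround (a sketch of my own, not something from the article; it assumes a hypothetical table with a VARCHAR(MAX) column named Csv) is to cast the MAX data down to VARCHAR(8000) before it reaches the join:

    ```sql
    -- Sketch: feed only the rows whose CSV fits in 8000 bytes to the fast
    -- VARCHAR(8000) splitter from the article (dbo.DelimitedSplit8K);
    -- truly long rows would need a separate "long string" splitter.
    SELECT d.RowNum, s.ItemNumber, s.Item
      FROM dbo.SomeMaxTable d
     CROSS APPLY dbo.DelimitedSplit8K(CAST(d.Csv AS VARCHAR(8000)), ',') s
     WHERE DATALENGTH(d.Csv) <= 8000; --guards against silent truncation by the CAST
    ```

    The filter keeps the CAST honest; rows longer than 8000 bytes fall through to whatever long-string method is in use.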

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I am not debating whether it will be the faster method or not; its simple, well-thought-out construction will make sure it is. But it could potentially scale much better to larger strings if certain internals have been improved.

  • That would certainly be nice.

    --Jeff Moden



  • I am running your splitter tests on a SQL Server running in a VM. It's taking forever!

    I let it go for 17 hours earlier this week, and finally tried canceling the query. But it wouldn't stop. So I killed the process. It was in KILLED/ROLLBACK state for 2 more days and furthermore blocking itself (!). I had to restart the server.

    I tried again, starting the splitter tests 5 hours ago. What do you know--it's blocking itself again:

    spid 56 blocked by spid 56, waittime 17840231, waittype LCK_M_X, waitresource KEY: 2:562949956108288, cpu 20858, physical_io 1585, open tran 2, status suspended, cmd SELECT INTO, DBCC INPUTBUFFER reveals last statement is the one starting

    SELECT Info = 'Please make sure you''re in the grid mode when running these tests.'

    Any thoughts?

    Given the crazy waittime, and low cpu and io, it seems it got blocked really early and then sat there forever. I bet if I kill it nothing will happen, and if I wait till tomorrow nothing will happen. Sigh.
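
    For reference, wait details like the ones above can be pulled from the standard DMVs; a query along these lines (sys.dm_exec_requests and its columns are standard in SQL 2005+) shows which sessions are blocked and on what:

    ```sql
    -- List currently blocked requests with their wait details.
    SELECT r.session_id,
           r.blocking_session_id,
           r.wait_type,
           r.wait_time,      --in milliseconds
           r.wait_resource,
           r.cpu_time,
           r.command,
           r.status
      FROM sys.dm_exec_requests r
     WHERE r.blocking_session_id <> 0; --only requests that are blocked
    ```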

  • Apologies... I have no clue what the problem could be. We have a VM environment at work and it screams.

    --Jeff Moden



  • kinda like those Oldsmobile ads -- "this isn't your father's testing environment."

    maybe best stated by Obi-Wan Kenobi, "These aren't the test results you're looking for."

  • These types of problems with VMs are typically not code related. A lot of folks set up VMs so they can move resources to and from SQL Server instances and, when such moves occur, some really bad things can happen. I don't know if that's Erik's case, but I thought it was worth mentioning.

    --Jeff Moden



  • Second try: failed. 3 days and 20-some-odd hours later, exact same results as before, only with wait time 334,605,998. 🙂

    I think I will not be running this again on my VM SQL instance. That's okay, there are plenty of other places I can try it...

    The entire SQL Server could not be used for any other purpose for 2 days. I have something logging to a table every minute and it could not operate.

  • ErikEckhardt (3/19/2012)


    Second try: failed. 3 days and 20-some-odd hours later, exact same results as before, only with wait time 334,605,998. 🙂

    I think I will not be running this again on my VM SQL instance. That's okay, there are plenty of other places I can try it...

    The entire SQL Server could not be used for any other purpose for 2 days. I have something logging to a table every minute and it could not operate.

    That's really odd. No problems here with the script. Were you having high CPU or I/O?

    /* Anything is possible but is it worth it? */

  • Great article Jeff. I used the splitter today to break apart filenames - amazing performance for our test runs. Can't wait to let it loose across the live data on Monday - 120,000 filenames. Thanks!

    :exclamation: "Be brave. Take risks. Nothing can substitute experience." :exclamation:

  • Very cool. Thanks for the success story, Robin. Glad it helped.

    --Jeff Moden



  • Hi,

    I have used and referred to this same splitter many times, but recently needed to implement it in a real-time environment, so I had to do some more research on it. I started with some minor improvements and asked for suggestions on the sub-site. Many thanks to Oleg, who provided another pointer in a very fruitful discussion, along with an interesting 1133-element, 10000-row performance test (I am not going into the details and performance comparisons made). I tweaked the physical structure of the tally table a little bit according to the requirement by adding a pre-calculated column N1, which holds the value N+1, but there was only a minor gain as it just reduces some IO. The most important change was converting the tally table's column data type from BIGINT to INT. I have come up with 2 versions so far.

    Now, most importantly, the performance test. I followed the article's performance test and saw a major improvement; it beats all the versions. The margin was more significant as the number of elements increased or the size of the string grew. With this version of the splitter, for 1133 elements and 10000 rows, the test completed in approximately 19 seconds, which seems very promising as these are still iTVFs.

    USE [tempdb]
    GO
    SET NOCOUNT ON;
    GO
    IF OBJECT_ID('dbo.CsvTest') IS NOT NULL
       DROP TABLE CsvTest
    GO
    DECLARE @NumberOfElements INT,
            @NumberOfRows     INT
    /*======== PARAMETER VALUES ==============================*/
    SELECT @NumberOfElements = 1133
         , @NumberOfRows     = 10000
    /*========================================================*/
    SELECT TOP (@NumberOfRows) --Controls the number of rows in the test table
           ISNULL(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),0) AS RowNum,
           (
            SELECT CAST(STUFF( --=== STUFF gets rid of the leading comma
                   ( --=== This builds the CSV row with a leading comma
                    SELECT TOP (@NumberOfElements) --Controls the number of CSV elements in each row
                           ','+CAST(ABS(CHECKSUM(NEWID()))%100000 AS VARCHAR(10))
                      FROM sys.All_Columns t3 --Classic cross join pseudo-cursor
                     CROSS JOIN sys.All_Columns t4 --can produce row sets up to 121 million.
                     WHERE t1.Object_ID <> t3.Object_ID --Without this line, all rows would be the same
                       FOR XML PATH('')
                   )
                   ,1,1,'') AS VARCHAR(8000))
           ) AS Csv
      INTO CsvTest
      FROM sys.All_Columns t1 --Classic cross join pseudo-cursor
     CROSS JOIN sys.All_Columns t2 --can produce row sets up to 16 million rows
    GO
    ALTER TABLE CsvTest
      ADD PRIMARY KEY CLUSTERED (RowNum) WITH FILLFACTOR = 100;
    GO
    IF OBJECT_ID('dbo.Tally','U') IS NOT NULL
       DROP TABLE dbo.Tally;
    --===== Create and populate the Tally table on the fly.
    SELECT TOP 11000
           IDENTITY(INT,1,1) AS N, --PREVIOUSLY WAS BIGINT
           -- This ISNULL function makes the column NOT NULL
           ISNULL(ISNULL(CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS INT),0) + 1, 0) AS N1
      INTO dbo.Tally
      FROM sys.all_columns ac1
     CROSS JOIN sys.all_columns ac2
    ;
    GO
    --===== Add a CLUSTERED Primary Key to maximize performance
    ALTER TABLE dbo.Tally
      ADD CONSTRAINT PK_Tally_N
          PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
    ;
    --===== Update Stats
    UPDATE STATISTICS dbo.Tally WITH FULLSCAN
    ;
    GO
    --=====================================================================================================================
    --      Conditionally drop and recreate the Split8KTally_With_Column_N1 splitter function.
    --=====================================================================================================================
    IF OBJECT_ID(N'dbo.Split8KTally_With_Column_N1', N'IF') IS NOT NULL
       DROP FUNCTION dbo.Split8KTally_With_Column_N1;
    GO
    --===== Tally Table splitter (Split8KTally_With_Column_N1, iTVF) ======================================================
    /*===== THIS FUNCTION USES COLUMN "N1" OF THE TALLY TABLE, WHICH HOLDS PRE-CALCULATED VALUES FOR N+1 =================*/
    CREATE FUNCTION dbo.Split8KTally_With_Column_N1
            (@pString VARCHAR(8000), @pDelimiter VARCHAR(1))
    RETURNS TABLE AS
     RETURN
    /*===== FIND THE FIRST DELIMITED ITEM EXPLICITLY, AS WE SKIPPED THE FIRST ITEM TO AVOID AN "OR" IN THE WHERE CLAUSE ==*/
     SELECT ItemNumber = CAST(0 AS BIGINT) --STARTING WITH ZERO FOR SORT AND TO AVOID ROW_NUMBER + 1 LATER
          , ItemValue  = SUBSTRING(@pString, 1,
                                   ISNULL(NULLIF(CHARINDEX(@pDelimiter,
                                                 @pString COLLATE Latin1_General_BIN), 0)
                                        - 1, 8000))
      UNION ALL -- CONCATENATE THE OUTPUTS
    /*===== FIND THE REST OF THE DELIMITED ITEMS USING THE TALLY TABLE ===================================================*/
     SELECT ItemNumber = ROW_NUMBER() OVER (ORDER BY [N])
          , ItemValue  = SUBSTRING(@pString, [N1],
                                   ISNULL(NULLIF(CHARINDEX(@pDelimiter,
                                                 @pString COLLATE Latin1_General_BIN, [N1]), 0)
                                        - [N1], 8000))
       FROM [dbo].[Tally] WITH (NOLOCK) --NOLOCK HINT IS NOT NECESSARY
      WHERE [N] BETWEEN 1 AND LEN(@pString)
        AND SUBSTRING(@pString, [N], 1) = @pDelimiter
    GO
    --=====================================================================================================================
    --      Conditionally drop and recreate the Split8KTally_With_N_PLUS_1 splitter function.
    --=====================================================================================================================
    IF OBJECT_ID('dbo.Split8KTally_With_N_PLUS_1','IF') IS NOT NULL
       DROP FUNCTION [dbo].Split8KTally_With_N_PLUS_1
    GO
    --===== Tally Table splitter (Split8KTally_With_N_PLUS_1, iTVF) =======================================================
    /*===== THIS FUNCTION USES THE RUNTIME CALCULATION "N + 1" INSTEAD OF THE PRE-CALCULATED COLUMN "N1" =================*/
    CREATE FUNCTION dbo.Split8KTally_With_N_PLUS_1
            (@pString VARCHAR(8000), @pDelimiter VARCHAR(1))
    RETURNS TABLE AS
     RETURN
    /*===== FIND THE FIRST DELIMITED ITEM EXPLICITLY, AS WE SKIPPED THE FIRST ITEM TO AVOID AN "OR" IN THE WHERE CLAUSE ==*/
     SELECT ItemNumber = CAST(0 AS BIGINT) --STARTING WITH ZERO FOR SORT AND TO AVOID ROW_NUMBER + 1 LATER
          , ItemValue  = SUBSTRING(@pString, 1,
                                   ISNULL(NULLIF(CHARINDEX(@pDelimiter,
                                                 @pString COLLATE Latin1_General_BIN), 0)
                                        - 1, 8000))
      UNION ALL -- CONCATENATE THE OUTPUTS
    /*===== FIND THE REST OF THE DELIMITED ITEMS USING THE TALLY TABLE ===================================================*/
     SELECT ItemNumber = ROW_NUMBER() OVER (ORDER BY [N])
          , ItemValue  = SUBSTRING(@pString, (N+1),
                                   ISNULL(NULLIF(CHARINDEX(@pDelimiter,
                                                 @pString COLLATE Latin1_General_BIN, (N+1)), 0)
                                        - (N+1), 8000))
       FROM [dbo].[Tally] WITH (NOLOCK) --NOLOCK HINT IS NOT NECESSARY
      WHERE [N] BETWEEN 1 AND LEN(@pString)
        AND SUBSTRING(@pString, [N], 1) = @pDelimiter
    GO
    --Tally Test
    PRINT '/*====== dbo.Split8KTally_With_N_PLUS_1 =================*/'
    DBCC FREEPROCCACHE WITH NO_INFOMSGS
    SET STATISTICS TIME ON
    DECLARE @ItemNumber BIGINT
          , @Item       VARCHAR(8000);
    SELECT @ItemNumber = V.ItemNumber
         , @Item       = V.ItemValue
      FROM dbo.CsvTest D
     CROSS APPLY dbo.Split8KTally_With_N_PLUS_1(D.Csv, ',') V
    SET STATISTICS TIME OFF
    PRINT '/*========================================================*/'
    PRINT CHAR(10) + CHAR(13)
    GO
    PRINT '/*====== dbo.Split8KTally_With_Column_N1 =================*/'
    DBCC FREEPROCCACHE WITH NO_INFOMSGS
    SET STATISTICS TIME ON
    DECLARE @ItemNumber BIGINT
          , @Item       VARCHAR(8000);
    SELECT @ItemNumber = V.ItemNumber
         , @Item       = V.ItemValue
      FROM dbo.CsvTest D
     CROSS APPLY dbo.Split8KTally_With_Column_N1(D.Csv, ',') V
    SET STATISTICS TIME OFF
    PRINT '/*========================================================*/'
    PRINT CHAR(10) + CHAR(13)
    GO

    Last but not least, thanks to all who contribute voluntarily to help the community; however, I cannot thank one of my favorite mentors, Jeff Moden, enough.

    P.S. The naming conventions are not according to standards, as the purpose is to convey the core of each function 🙂 Moreover, we all know everyone's mileage may differ.

  • EDIT:

    Testing for VARCHAR(MAX) version was being done on SQL 2005 only.

    /EDIT

    Since, in the future, we may need the VARCHAR(MAX) version as well, I did some testing on the other solutions as tally table solution does not behave well with wider strings. Two of them performed really well

    1.Inline Recursive CTE solution (Split_RCTE, Tweaked a little bit for better performance)

    2.TVF XML solution by Oleg (Split_XML_Solution_By_Oleg)

    But surprisingly, RCTE was winner on some occasions with good margin as compared to losing. If I do the same 10000 rows, 1133 elements with 125-150 characters width, RCTE beat the XML solution marginally. But if the numbers of elements are decreased to 500, the RCTE was almost twice as fast as XML. But I would not count out any of the solutions, as they could fit according the requirement and environment.

    As for the RCTE solution, I must say it is more resource intensive. The index spool and SORT operators in the execution plan clearly indicate that it will hit memory, tempdb and the processors hard. This also suggests that with more CPUs, more memory and a well-planned tempdb, this solution could perform even better relative to the others. Having said that, I may still tilt towards the XML solution.

    To me, the reason the RCTE performs so much better than the tally table solution (both use CHARINDEX and SUBSTRING) is that the tally table solution compares every character with the delimiter and, due to the out-of-row phenomenon, this does not scale well, whereas the RCTE solution does it only the required number of times. Moreover, up to VARCHAR(8000), the SUBSTRING(@pString, N, 1) = @pDelimiter test is handled as a predicate in addition to the seek predicate; but for VARCHAR(MAX), this is split into two steps, i.e. a seek predicate followed by a filter predicate, which decreases performance quite a bit.

    BTW, this is what my tally table VARCHAR(MAX) version looks like:

    SET QUOTED_IDENTIFIER ON
    SET ANSI_NULLS ON
    GO
    --Create the Split Tally function, VARCHAR(MAX) version
    CREATE FUNCTION dbo.Split8KTallyM
            (@pString VARCHAR(MAX), @pDelimiter VARCHAR(1))
    RETURNS @Results TABLE (ItemNumber BIGINT, ItemValue VARCHAR(MAX))
    AS
     BEGIN
        SET @pString = @pDelimiter --To avoid the OR condition in the WHERE clause for the first item
                     + @pString
                     + @pDelimiter --To avoid the ISNULL/NULLIF
     INSERT INTO @Results ([ItemNumber], [ItemValue])
     SELECT ItemNumber = ROW_NUMBER() OVER (ORDER BY N)
          , Item       = SUBSTRING(@pString, N1,
                                   CAST(CHARINDEX(@pDelimiter,
                                        @pString COLLATE Latin1_General_BIN, N1) AS INT)
                                 - N1)
       FROM [dbo].[Tally] WITH (NOLOCK) --NOLOCK HINT IS NOT NECESSARY
      WHERE N BETWEEN 1 AND CONVERT(INT, LEN(@pString)) -- CAST TO THE COLUMN DATATYPE
                          - 1                           -- TO EXCLUDE THE LAST DELIMITER
        AND SUBSTRING(@pString, N, 1) = @pDelimiter
     RETURN;
     END
    GO

    Now the performance test for the RCTE and XML solutions (I did not try the tally table solution, as it is quite slow):

    USE [tempdb]
    GO
    SET NOCOUNT ON;
    GO
    IF OBJECT_ID(N'dbo.iFunction', N'V') IS NOT NULL
       DROP VIEW iFunction
    GO
    SET QUOTED_IDENTIFIER ON
    SET ANSI_NULLS ON
    GO
    CREATE VIEW dbo.iFunction AS
    /**********************************************************************************************************************
     Purpose:
     This view is callable from UDFs, which allows us to indirectly get a NEWID() within a function where we can't do such
     a thing directly in the function. This view also solves the same problem for GETDATE().
     Usage:
     SELECT MyNewID FROM dbo.iFunction; --Returns a GUID
     SELECT MyDate  FROM dbo.iFunction; --Returns a Date
     Revision History:
     Rev 00 - 06 Jun 2004 - Jeff Moden - Initial creation
     Rev 01 - 06 Mar 2011 - Jeff Moden - Formalize code. No logic changes.
    **********************************************************************************************************************/
     SELECT MyNewID = NEWID(),
            MyDate  = GETDATE();
    GO
    IF OBJECT_ID(N'dbo.CsvTest', N'U') IS NOT NULL
       DROP TABLE CsvTest
    GO
    DECLARE @MaxElementWidth  INT,
            @MinElementWidth  INT,
            @NumberOfElements INT,
            @NumberOfRows     INT
    /*======== PARAMETER VALUES ==============================*/
    SELECT @MaxElementWidth  = 150
         , @MinElementWidth  = 125
         , @NumberOfElements = 500
         , @NumberOfRows     = 10000
    /*========================================================*/
    SELECT TOP (@NumberOfRows) --Controls the number of rows in the test table
           ISNULL(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),0) AS RowNum,
           CSV =
           (--==== This creates each CSV
            SELECT CAST(
                   STUFF( --=== STUFF gets rid of the leading comma
                   ( --=== This builds the CSV row with a leading comma
                    SELECT TOP (@NumberOfElements) --Controls the number of CSV elements in each row
                           ','
                         + LEFT(--==== Builds a random-length value within the element width constraints
                                LEFT(REPLICATE('1234567890',CEILING(@MaxElementWidth/10.0)), @MaxElementWidth),
                                ABS(CHECKSUM((SELECT MyNewID FROM dbo.iFunction)))
                                % (@MaxElementWidth - @MinElementWidth + 1) + @MinElementWidth
                               )
                      FROM sys.All_Columns ac3 --Classic cross join pseudo-cursor
                     CROSS JOIN sys.All_Columns ac4 --can produce row sets up to 16 million.
                     WHERE ac3.Object_ID <> ac1.Object_ID --Without this line, all rows would be the same.
                       FOR XML PATH('')
                   )
                   ,1,1,'')
                   AS VARCHAR(MAX))
           )
      INTO CsvTest
      FROM sys.All_Columns ac1 --Classic cross join pseudo-cursor
     CROSS JOIN sys.All_Columns ac2 --can produce row sets up to 16 million rows
    GO
    PRINT '/*====== dbo.CSVTest Population completed ================*/'
    PRINT CHAR(10) + CHAR(13)
    GO
    ALTER TABLE CsvTest
      ADD PRIMARY KEY CLUSTERED (RowNum) WITH FILLFACTOR = 100;
    GO
    IF OBJECT_ID(N'dbo.Split_RCTE', N'IF') IS NOT NULL
       DROP FUNCTION dbo.Split_RCTE
    GO
    SET QUOTED_IDENTIFIER ON
    SET ANSI_NULLS ON
    GO
    --Create the Split_RCTE function, VARCHAR(MAX) version
    CREATE FUNCTION dbo.Split_RCTE
            (@pString VARCHAR(MAX), @pDelimiter VARCHAR(1))
    RETURNS TABLE WITH SCHEMABINDING AS
     RETURN
       WITH cteSplit AS
            (
             SELECT StartPosition = 0
                  , EndPosition   = CONVERT(INT, CHARINDEX(@pDelimiter,
                                    @pString COLLATE Latin1_General_BIN))
              UNION ALL
             SELECT StartPosition = EndPosition + 1
                  , EndPosition   = CONVERT(INT, CHARINDEX(@pDelimiter,
                                    @pString COLLATE Latin1_General_BIN,
                                    EndPosition + 1))
               FROM cteSplit
              WHERE EndPosition > 0
            )
     SELECT [ItemNumber] = ROW_NUMBER() OVER (ORDER BY StartPosition)
          , ItemValue    = SUBSTRING(@pString, StartPosition,
                           CASE EndPosition
                                WHEN 0 THEN CONVERT(INT, LEN(@pString)) + 1
                                ELSE EndPosition - StartPosition
                           END)
       FROM cteSplit
    GO
    IF OBJECT_ID(N'dbo.Split_XML_Solution_By_Oleg', N'TF') IS NOT NULL
       DROP FUNCTION dbo.Split_XML_Solution_By_Oleg
    GO
    --Create the Split_XML function, VARCHAR(MAX) version
    SET QUOTED_IDENTIFIER ON
    SET ANSI_NULLS ON
    GO
    CREATE FUNCTION dbo.Split_XML_Solution_By_Oleg
            (@Parameter VARCHAR(MAX), @Delimiter VARCHAR(1))
    RETURNS @Result TABLE (ItemNumber INT, ItemValue VARCHAR(MAX))
    AS
     BEGIN
        DECLARE @XML XML;
        SET @Parameter = (SELECT @Parameter FOR XML PATH(''));
        SELECT @XML = '<r>' + REPLACE(@Parameter, @Delimiter, '</r><r>') + '</r>';
     INSERT INTO @Result (ItemNumber, ItemValue)
     SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ItemNumber
          , Item.value('text()[1]', 'VARCHAR(MAX)')    AS ItemValue
       FROM @XML.nodes('//r') R (Item);
     RETURN;
     END;
    GO
    PRINT '/*====== dbo.Split_RCTE ==================================*/'
    DBCC FREEPROCCACHE WITH NO_INFOMSGS
    SET STATISTICS TIME ON
    DECLARE @ItemNumber BIGINT
          , @Item       VARCHAR(MAX);
    SELECT @ItemNumber = V.ItemNumber
         , @Item       = V.ItemValue
      FROM dbo.CsvTest D
     CROSS APPLY dbo.Split_RCTE(D.Csv, ',') V
    OPTION (MAXRECURSION 0)
    SET STATISTICS TIME OFF
    PRINT '/*========================================================*/'
    PRINT CHAR(10) + CHAR(13)
    GO
    PRINT '/*====== dbo.Split_XML ===================================*/'
    DBCC FREEPROCCACHE WITH NO_INFOMSGS
    SET STATISTICS TIME ON
    DECLARE @ItemNumber BIGINT
          , @Item       VARCHAR(MAX);
    SELECT @ItemNumber = V.ItemNumber
         , @Item       = V.ItemValue
      FROM dbo.CsvTest D
     CROSS APPLY dbo.Split_XML_Solution_By_Oleg(D.Csv, ',') V
    SET STATISTICS TIME OFF
    PRINT '/*========================================================*/'
    PRINT CHAR(10) + CHAR(13)
    GO

    As always, everyone's mileage may differ.

  • Too bad (for me) this new round comes at a bad time, as I will be too busy this week to participate in this round of challenges.

    As for bad performance with some implementations, please also try to benchmark with OPTION (MAXDOP 1), as there is a bug in SQL Server 2008 and later that can (unpredictably) cause extreme slowdowns when parallel plans are used. We would not want to see results tainted by this, of course.
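
    For example, each timing batch can be pinned to a serial plan with a statement-level hint (a sketch using the function and table names from the earlier posts in this thread):

    ```sql
    -- Force a serial plan for one statement only; no server-wide setting changes.
    DECLARE @ItemNumber BIGINT
          , @Item       VARCHAR(8000);
    SELECT @ItemNumber = V.ItemNumber
         , @Item       = V.ItemValue
      FROM dbo.CsvTest D
     CROSS APPLY dbo.Split8KTally_With_N_PLUS_1(D.Csv, ',') V
    OPTION (MAXDOP 1); --rules out parallelism-related slowdowns in the timings
    ```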

Viewing 15 posts - 286 through 300 (of 990 total)
