Creating Delimited Strings

  • Ah... thanks, Brandie.

    As a sidebar, I hate articles like the one on Simple-Talk and the duplicate on the Data Exchange link. There's absolutely no proof of performance of any of the methods. Whether it happened that way or not, it looks like the author simply Googled a bunch a stuff and threw it together in an article.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I hear ya Jeff...I am planning on addressing Mark's question and testing the CLR method for performance...it would be cool to have this as a drop-in but I must admit Mark's question set me off on sorting possibilities.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Update from the front. I have replaced my multiple inserts with an UNPIVOT. Rather than compare 15 different columns as listed above, I've done the following:

    ALTER TABLE DT.Staging

    ADD TempRecID int IDENTITY(1,1) NOT NULL;

    --Identity field necessary for UNPIVOT to give me my field list

    CREATE TABLE dbo.#TempStage (DocID int, Col1 varchar(30), Col2 varchar(30), <<Numerous fields>>);

    GO

    --This table is necessary because my Staging table uses different size varchar & char fields,

    -- so UNPIVOT doesn't work with it. All columns must be sized & typed the same

    INSERT INTO dbo.#TempStage (DocID, Col1, Col2, <<Numerous fields>>)

    SELECT DocID, ISNULL(Col1,''), ISNULL(Col2,''), <<Numerous fields>>

    FROM DT.Staging

    WHERE Col1 IS NULL OR Col1='' OR Col2='' OR Col2 IS NULL

    <<Numerous fields>>;

    --Find all my missing values

    CREATE TABLE dbo.#TempErrors (DocID int, FieldType varchar(20), ErrorDetail varchar(400));

    INSERT INTO dbo.#TempErrors (DocID, FieldType)

    SELECT DocID, TempRecID AS FieldType

    FROM (SELECT DocID, Col1, Col2, <<Numerous fields>>

    FROM dbo.#TempStage) stg

    UNPIVOT (FieldValue FOR TempRecID IN (Col1, Col2, <<Numerous fields>>) ) AS Error

    WHERE FieldValue = '';

    UPDATE samf

    SET ErrorFlag = 1, ErrorDetail = samf2.ErrorDetail

    FROM DT.Staging samf

    INNER JOIN (SELECT DocID,stuff(( SELECT ',' + FieldType

    FROM dbo.#TempErrors s2

    WHERE s2.DocID= s1.DocID --- must match GROUP BY below

    ORDER BY DocID

    FOR XML PATH('')

    ),1,1,'') as [ErrorDetail]

    FROM dbo.#TempErrors s1

    GROUP BY s1.DocID /* without GROUP BY multiple rows are returned */ ) samf2

    ON samf.DocID = samf2.DocID;

    --Final Update

    ALTER TABLE DT.Staging

    DROP COLUMN TempRecID;

    DROP TABLE dbo.#TempStage;

    DROP TABLE dbo.#TempErrors;

    Well, that was harder than it needed to be. And the solution is easier than I expected. Now I just have to replace my field values with meaningful error messages and I'm set.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Not surprising given the usual finds on the internets but the CLR code would not compile out the box...however after some polishing the little nugget works.

    Mark, sorting is a can of worms...it can be done but I'm leaving the can alone for now. It looks like from Brandie's desired output in the OP sorting may not be a requirement for her project but sorting options could definitely be useful for other applications.

    Attached is a rar archive containing:

    1. VS2010 Solution with a SQL CLR project that can build and deploy group_concat. * See below for info on deployng the bundled binary manually if you don't have VS2010 or don't care to bother with it.

    2. SQL Script "enable_CLR.sql" that enables CLR for a particular DB.

    3. SQL script "test_query.sql" that demostrates how dbo.group_concat might be used.

    I likely will not have time to do any worthwhile performance testing today but I figured I would tee up the objects for easy deployment in case others have some cycles.

    * To deploy manually using sqlcmd:

    1. Unpack archive to a disk local to SQL Server (my examples assume a home dir of C:\@\).

    2. Open "C:\@\Stuff_2010\enable_CLR.sql" in SSMS and compile against your DB.

    3.1. Open C:\@\Stuff_2010\group_concat\bin\Release\group_concat.sql in a text editor or SSMS and change test on this line to the name of your DB

    :setvar DatabaseName "test"

    3.2. Open a command prompt and run this to deploy the assembly and function to your DB:

    sqlcmd -S ".\SQLEXPRESS_2008" -E -i "C:\@\Stuff_2010\group_concat\bin\Release\group_concat.sql"

    Edit: fix path to group_concat.sql in sqlcmd command line

    Edit 2: forgot to include info on changing the DB name in the deploy script (3.1.)

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • My only issue with sorting was making sure I didn't have two copies of the same record with the errors flipped (Ship Date; Sale Date vs Sale Date; Ship Date). So, no, I don't require you to play with that particular can.

    It might be interesting, though, to go back to that option when there's time.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • The aggregate I posted will not prevent duplicates from entering the concatenated string. You can however provide the DISTINCT keyword. The example in my attachment has this query:

    SELECT DocID,

    dbo.group_concat(FieldType) AS FieldTypeDetail

    FROM #TempErrors

    GROUP BY DocID

    ORDER BY DocID ;

    So you would just do this if you wanted to prevent dups:

    SELECT DocID,

    dbo.group_concat(DISTINCT FieldType) AS FieldTypeDetail

    FROM #TempErrors

    GROUP BY DocID

    ORDER BY DocID ;

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Brandie Tarvin (4/7/2011)


    Well, that was harder than it needed to be. And the solution is easier than I expected. Now I just have to replace my field values with meaningful error messages and I'm set.

    Brandie, do you have any "special" characters in your "ErrorDetail" column?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (4/7/2011)


    Brandie Tarvin (4/7/2011)


    Well, that was harder than it needed to be. And the solution is easier than I expected. Now I just have to replace my field values with meaningful error messages and I'm set.

    Brandie, do you have any "special" characters in your "ErrorDetail" column?

    Other than delimiting the string with a semi-colon or a comma, no. It's all alpha a-z.

    EDIT: Meaningful error messages are like "Last Name is required", "Sale Date is required", etc. I'm just appending "is required" to the column names, and spacing out the column names so they're user-readable.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • The reason I was asking is that if you had special characters such as < or >, etc, then we'd have to add just a little more code (which also makes it about 20% slower) in the form of the word TYPE in order for them to come out correctly.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • That's what I figured. But fortunately, that's not a problem for me today.

    Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

  • Jeff Moden (4/8/2011)


    The reason I was asking is that if you had special characters such as < or >, etc, then we'd have to add just a little more code (which also makes it about 20% slower) in the form of the word TYPE in order for them to come out correctly.

    I have built several CLR Aggregates that offer comparable functionality to the XML method with respect to sorting and optional separator and have tuned them to the best of my ability. I am almost done documenting the performance of the XML and CLR Aggregate methods but I want to run with a few more sets of test data before I post it. I would also like to include tests when data could contain < or >. Can someone provide a code sample using the XML method that would account for it?

    Original sample w/out TYPE:

    SELECT DocID,stuff(( SELECT ',' + FieldType

    FROM #TempErrors s2

    WHERE s2.DocID= s1.DocID --- must match GROUP BY below

    ORDER BY DocID

    FOR XML PATH('')

    ),1,1,'') as [Skills]

    FROM #TempErrors s1

    GROUP BY s1.DocID --- without GROUP BY multiple rows are returned

    ORDER BY s1.DocID

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • The attachment contains the complete VS2010 solution and a deploy script which will allow you to deploy the CLR objects without using VS2010.

    Enjoy 🙂

    SET STATISTICS TIME OFF ;

    GO

    SET NOCOUNT ON ;

    GO

    USE test

    GO

    IF OBJECT_ID(N'test.dbo.OPC3Test') > 0

    DROP TABLE test.dbo.OPC3Test ;

    GO

    CREATE TABLE test.dbo.OPC3Test

    (

    DocID INT NOT NULL,

    FieldType NVARCHAR(20) NOT NULL,

    ErrorDetail NVARCHAR(400) NULL

    ) ;

    GO

    CREATE CLUSTERED INDEX [dbo.OPC3Test_DocID__INC_FieldType]

    ON test.dbo.OPC3Test (DocID,FieldType)

    GO

    INSERT INTO test.dbo.OPC3Test

    (

    DocID,

    FieldType,

    ErrorDetail

    )

    SELECT 1,

    'Sale Date',

    'Invalid Sale Date'

    UNION ALL

    SELECT 1,

    'DocumentNumber',

    'DocumentNumber not a number'

    UNION ALL

    SELECT 1,

    'Sale Date',

    'Sale Date Before Open Date'

    UNION ALL

    SELECT 2,

    'First Name',

    'Empty First Name'

    UNION ALL

    SELECT 3,

    'Last Name',

    'Last Name cannot be NULL'

    UNION ALL

    SELECT 3,

    'DocumentNumber',

    'DocumentNumber not a number'

    UNION ALL

    SELECT 8,

    'City',

    'City not found in State'

    UNION ALL

    SELECT 999,

    'IsAllocated',

    NULL

    UNION ALL

    SELECT 3330,

    'IsUtilized',

    NULL

    UNION ALL

    SELECT 3330,

    'World',

    'Hello!'

    GO 30000 -- << SSMS feature to repeat batches multiple times

    DECLARE @ct INT ;

    SELECT @ct = COUNT(*)

    FROM test.dbo.OPC3Test ;

    RAISERROR('%d rows in test.dbo.OPC3Test',10,1,@ct) ;

    GO

    PRINT ''

    PRINT '-------------------------------------------------------------------------------------------'

    PRINT ' DISTINCT, unsorted'

    PRINT '-------------------------------------------------------------------------------------------'

    PRINT ''

    PRINT '========== XML Method ======================================'

    SET STATISTICS TIME ON ;

    SELECT DocID,

    STUFF((SELECT DISTINCT

    N',' + ErrorDetail

    FROM test.dbo.OPC3Test s2

    WHERE s2.DocID = s1.DocID

    FOR

    XML PATH('')

    ), 1, 1, '') AS [Skills]

    FROM test.dbo.OPC3Test s1

    GROUP BY s1.DocID

    ORDER BY s1.DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '========== CLR group_concat (no custom inputs) ============='

    SET STATISTICS TIME ON ;

    SELECT DocID,

    test.dbo.group_concat(DISTINCT ErrorDetail) AS FieldTypeDetail

    FROM test.dbo.OPC3Test

    GROUP BY DocID

    ORDER BY DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '========== CLR group_concat_delim (custom inputs) =========='

    SET STATISTICS TIME ON ;

    SELECT DocID,

    test.dbo.group_concat_delim(DISTINCT ErrorDetail, N',') AS FieldTypeDetail

    FROM test.dbo.OPC3Test

    GROUP BY DocID

    ORDER BY DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '-------------------------------------------------------------------------------------------'

    PRINT ' DISTINCT, sorted ASC'

    PRINT '-------------------------------------------------------------------------------------------'

    PRINT ''

    PRINT '========== XML Method ======================================'

    SET STATISTICS TIME ON ;

    SELECT DocID,

    STUFF((SELECT DISTINCT

    N',' + ErrorDetail

    FROM test.dbo.OPC3Test s2

    WHERE s2.DocID = s1.DocID

    ORDER BY N',' + ErrorDetail ASC

    FOR

    XML PATH('')

    ), 1, 1, '') AS [Skills]

    FROM test.dbo.OPC3Test s1

    GROUP BY s1.DocID

    ORDER BY s1.DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '========== CLR group_concat_sorted (custom inputs) ========='

    SET STATISTICS TIME ON ;

    SELECT DocID,

    test.dbo.group_concat_sorted(DISTINCT ErrorDetail, N'ASC') AS FieldTypeDetail

    FROM test.dbo.OPC3Test

    GROUP BY DocID

    ORDER BY DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '========== CLR group_concat_delim_sorted (custom inputs) ==='

    SET STATISTICS TIME ON ;

    SELECT DocID,

    test.dbo.group_concat_delim_sorted(DISTINCT ErrorDetail, N',', N'ASC') AS FieldTypeDetail

    FROM test.dbo.OPC3Test

    GROUP BY DocID

    ORDER BY DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '-------------------------------------------------------------------------------------------'

    PRINT ' DISTINCT, sorted DESC'

    PRINT '-------------------------------------------------------------------------------------------'

    PRINT ''

    PRINT '========== XML Method ======================================'

    SET STATISTICS TIME ON ;

    SELECT DocID,

    STUFF((SELECT DISTINCT

    N',' + ErrorDetail

    FROM test.dbo.OPC3Test s2

    WHERE s2.DocID = s1.DocID

    ORDER BY N',' + ErrorDetail DESC

    FOR

    XML PATH('')

    ), 1, 1, '') AS [Skills]

    FROM test.dbo.OPC3Test s1

    GROUP BY s1.DocID

    ORDER BY s1.DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '========== CLR group_concat_sorted (custom inputs) ========='

    SET STATISTICS TIME ON ;

    SELECT DocID,

    test.dbo.group_concat_sorted(DISTINCT ErrorDetail, N'DESC') AS FieldTypeDetail

    FROM test.dbo.OPC3Test

    GROUP BY DocID

    ORDER BY DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '========== CLR group_concat_delim_sorted (custom inputs) ==='

    SET STATISTICS TIME ON ;

    SELECT DocID,

    test.dbo.group_concat_delim_sorted(DISTINCT ErrorDetail, N',', N'DESC') AS FieldTypeDetail

    FROM test.dbo.OPC3Test

    GROUP BY DocID

    ORDER BY DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '-------------------------------------------------------------------------------------------'

    PRINT ' NON-DISTINCT, unsorted'

    PRINT '-------------------------------------------------------------------------------------------'

    PRINT ''

    PRINT '========== XML Method ======================================'

    SET STATISTICS TIME ON ;

    SELECT DocID,

    STUFF((SELECT N',' + ErrorDetail

    FROM test.dbo.OPC3Test s2

    WHERE s2.DocID = s1.DocID

    FOR

    XML PATH('')

    ), 1, 1, '') AS [Skills]

    FROM test.dbo.OPC3Test s1

    GROUP BY s1.DocID

    ORDER BY s1.DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '========== CLR group_concat (no custom inputs) ============='

    SET STATISTICS TIME ON ;

    SELECT DocID,

    test.dbo.group_concat(ErrorDetail) AS FieldTypeDetail

    FROM test.dbo.OPC3Test

    GROUP BY DocID

    ORDER BY DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '========== CLR group_concat_delim (custom inputs) =========='

    SET STATISTICS TIME ON ;

    SELECT DocID,

    test.dbo.group_concat_delim(ErrorDetail, N',') AS FieldTypeDetail

    FROM test.dbo.OPC3Test

    GROUP BY DocID

    ORDER BY DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '-------------------------------------------------------------------------------------------'

    PRINT ' NON-DISTINCT, sorted ASC'

    PRINT '-------------------------------------------------------------------------------------------'

    PRINT ''

    PRINT '========== XML Method ======================================'

    SET STATISTICS TIME ON ;

    SELECT DocID,

    STUFF((SELECT N',' + ErrorDetail

    FROM test.dbo.OPC3Test s2

    WHERE s2.DocID = s1.DocID

    ORDER BY N',' + ErrorDetail ASC

    FOR

    XML PATH('')

    ), 1, 1, '') AS [Skills]

    FROM test.dbo.OPC3Test s1

    GROUP BY s1.DocID

    ORDER BY s1.DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '========== CLR group_concat_sorted (custom inputs) ========='

    SET STATISTICS TIME ON ;

    SELECT DocID,

    test.dbo.group_concat_sorted(ErrorDetail, N'ASC') AS FieldTypeDetail

    FROM test.dbo.OPC3Test

    GROUP BY DocID

    ORDER BY DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '========== CLR group_concat_delim_sorted (custom inputs) ==='

    SET STATISTICS TIME ON ;

    SELECT DocID,

    test.dbo.group_concat_delim_sorted(ErrorDetail, N',', N'ASC') AS FieldTypeDetail

    FROM test.dbo.OPC3Test

    GROUP BY DocID

    ORDER BY DocID ;

    SET STATISTICS TIME OFF ;

    GO

    PRINT ''

    PRINT '-------------------------------------------------------------------------------------------'

    PRINT ' NON-DISTINCT, sorted DESC'

    PRINT '-------------------------------------------------------------------------------------------'

    PRINT ''

    PRINT '========== XML Method ======================================'

    SET STATISTICS TIME ON ;

    SELECT DocID,

    STUFF((SELECT N',' + ErrorDetail

    FROM test.dbo.OPC3Test s2

    WHERE s2.DocID = s1.DocID

    ORDER BY N',' + ErrorDetail DESC

    FOR

    XML PATH('')

    ), 1, 1, '') AS [Skills]

    FROM test.dbo.OPC3Test s1

    GROUP BY s1.DocID

    ORDER BY s1.DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '========== CLR group_concat_sorted (custom inputs) ========='

    SET STATISTICS TIME ON ;

    SELECT DocID,

    test.dbo.group_concat_sorted(ErrorDetail, N'DESC') AS FieldTypeDetail

    FROM test.dbo.OPC3Test

    GROUP BY DocID

    ORDER BY DocID ;

    SET STATISTICS TIME OFF ;

    PRINT ''

    PRINT '========== CLR group_concat_delim_sorted (custom inputs) ==='

    SET STATISTICS TIME ON ;

    SELECT DocID,

    test.dbo.group_concat_delim_sorted(ErrorDetail, N',', N'DESC') AS FieldTypeDetail

    FROM test.dbo.OPC3Test

    GROUP BY DocID

    ORDER BY DocID ;

    SET STATISTICS TIME OFF ;

    GO


    [font="Courier New"]------------------------------------------------------------

    DISTINCT, unsorted

    ------------------------------------------------------------

    ========== XML Method ======================================

    SQL Server Execution Times:

    CPU time = 219 ms, elapsed time = 269 ms.

    ========== CLR group_concat (no custom inputs) =============

    SQL Server Execution Times:

    CPU time = 203 ms, elapsed time = 190 ms.

    ========== CLR group_concat_delim (custom inputs) ==========

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 249 ms.

    ------------------------------------------------------------

    DISTINCT, sorted ASC

    ------------------------------------------------------------

    ========== XML Method ======================================

    SQL Server Execution Times:

    CPU time = 234 ms, elapsed time = 260 ms.

    ========== CLR group_concat_sorted (custom inputs) =========

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 245 ms.

    ========== CLR group_concat_delim_sorted (custom inputs) ===

    SQL Server Execution Times:

    CPU time = 281 ms, elapsed time = 278 ms.

    ------------------------------------------------------------

    DISTINCT, sorted DESC

    ------------------------------------------------------------

    ========== XML Method ======================================

    SQL Server Execution Times:

    CPU time = 235 ms, elapsed time = 259 ms.

    ========== CLR group_concat_sorted (custom inputs) =========

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 248 ms.

    ========== CLR group_concat_delim_sorted (custom inputs) ===

    SQL Server Execution Times:

    CPU time = 281 ms, elapsed time = 280 ms.

    ------------------------------------------------------------

    NON-DISTINCT, unsorted

    ------------------------------------------------------------

    ========== XML Method ======================================

    SQL Server Execution Times:

    CPU time = 391 ms, elapsed time = 685 ms.

    ========== CLR group_concat (no custom inputs) =============

    SQL Server Execution Times:

    CPU time = 453 ms, elapsed time = 858 ms.

    ========== CLR group_concat_delim (custom inputs) ==========

    SQL Server Execution Times:

    CPU time = 578 ms, elapsed time = 1675 ms.

    ------------------------------------------------------------

    NON-DISTINCT, sorted ASC

    ------------------------------------------------------------

    ========== XML Method ======================================

    SQL Server Execution Times:

    CPU time = 609 ms, elapsed time = 878 ms.

    ========== CLR group_concat_sorted (custom inputs) =========

    SQL Server Execution Times:

    CPU time = 1203 ms, elapsed time = 1547 ms.

    ========== CLR group_concat_delim_sorted (custom inputs) ===

    SQL Server Execution Times:

    CPU time = 1203 ms, elapsed time = 1706 ms.

    ------------------------------------------------------------

    NON-DISTINCT, sorted DESC

    ------------------------------------------------------------

    ========== XML Method ======================================

    SQL Server Execution Times:

    CPU time = 532 ms, elapsed time = 767 ms.

    ========== CLR group_concat_sorted (custom inputs) =========

    SQL Server Execution Times:

    CPU time = 1234 ms, elapsed time = 1713 ms.

    ========== CLR group_concat_delim_sorted (custom inputs) ===

    SQL Server Execution Times:

    CPU time = 1297 ms, elapsed time = 1650 ms.

    [/font]

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • It looks like the SQLCLRs are getting beat up a bit by straight T-SQL. I believe there may be an optimization on the DISTINCT XML example that we can do, as well.

    I'll try to provide you with the code for your previous request which includes "type" as well as the optimization depending on what priorities I get hit with tonight.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (4/13/2011)


    It looks like the SQLCLRs are getting beat up a bit by straight T-SQL. I believe there may be an optimization on the DISTINCT XML example that we can do, as well.

    I'll try to provide you with the code for your previous request which includes "type" as well as the optimization depending on what priorities I get hit with tonight.

    Thanks Jeff. I created yet another set of CLR methods that use a different internal data structure to store aggregate-strings and squeezed more performance out of the CLR methods. I have CLR methods outperforming XML 50% of the time. In my opinion, when the proper CLR aggregate is chosen for the use-case, the CLR is a viable replacement for the XML method. I am very interested in seeing what kind of performance bump the DISTINCT optimization you mentioned offers, as well as what kind of performance hit we see when we the XML is setup to handle data where angle brackets are possible using TYPE.

    I only started this as an exercise to learn a bit more about the CLR but have been pleasantly surprised by the way the CLR has kept pace...I am thinking of starting a new project on CodePlex.

    [font="Courier New"]300000 rows in test.dbo.OPC3Test

    -----------------------------------------------------------------------

    DISTINCT, unsorted

    -----------------------------------------------------------------------

    ========== XML Method =================================================

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 252 ms.

    ========== CLR group_concat (no custom inputs) ========================

    SQL Server Execution Times:

    CPU time = 188 ms, elapsed time = 188 ms.

    ========== CLR group_concat_dictionary (no custom inputs) =============

    SQL Server Execution Times:

    CPU time = 187 ms, elapsed time = 187 ms.

    ========== CLR group_concat_delim (custom inputs) =====================

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 263 ms.

    ========== CLR group_concat_dictionary_delim (no custom inputs) =======

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 246 ms.

    -----------------------------------------------------------------------

    DISTINCT, sorted ASC

    -----------------------------------------------------------------------

    ========== XML Method =================================================

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 246 ms.

    ========== CLR group_concat_sorted (custom inputs) ====================

    SQL Server Execution Times:

    CPU time = 266 ms, elapsed time = 291 ms.

    ========== CLR group_concat_dictionary_sorted (custom inputs) =========

    SQL Server Execution Times:

    CPU time = 266 ms, elapsed time = 254 ms.

    ========== CLR group_concat_delim_sorted (custom inputs) ==============

    SQL Server Execution Times:

    CPU time = 281 ms, elapsed time = 282 ms.

    ========== CLR group_concat_dictionary_delim_sorted (custom inputs) ===

    SQL Server Execution Times:

    CPU time = 281 ms, elapsed time = 278 ms.

    -----------------------------------------------------------------------

    DISTINCT, sorted DESC

    -----------------------------------------------------------------------

    ========== XML Method =================================================

    SQL Server Execution Times:

    CPU time = 266 ms, elapsed time = 281 ms.

    ========== CLR group_concat_sorted (custom inputs) ====================

    SQL Server Execution Times:

    CPU time = 250 ms, elapsed time = 249 ms.

    ========== CLR group_concat_dictionary_sorted (custom inputs) =========

    SQL Server Execution Times:

    CPU time = 234 ms, elapsed time = 246 ms.

    ========== CLR group_concat_delim_sorted (custom inputs) ==============

    SQL Server Execution Times:

    CPU time = 297 ms, elapsed time = 292 ms.

    ========== CLR group_concat_dictionary_delim_sorted (custom inputs) ===

    SQL Server Execution Times:

    CPU time = 281 ms, elapsed time = 281 ms.

    -----------------------------------------------------------------------

    NON-DISTINCT, unsorted

    -----------------------------------------------------------------------

    ========== XML Method =================================================

    SQL Server Execution Times:

    CPU time = 375 ms, elapsed time = 727 ms.

    ========== CLR group_concat (no custom inputs) ========================

    SQL Server Execution Times:

    CPU time = 516 ms, elapsed time = 841 ms.

    ========== CLR group_concat_dictionary (no custom inputs) =============

    SQL Server Execution Times:

    CPU time = 344 ms, elapsed time = 801 ms.

    ========== CLR group_concat_delim (custom inputs) =====================

    SQL Server Execution Times:

    CPU time = 562 ms, elapsed time = 954 ms.

    ========== CLR group_concat_dictionary_delim (custom inputs) ==========

    SQL Server Execution Times:

    CPU time = 469 ms, elapsed time = 774 ms.

    -----------------------------------------------------------------------

    NON-DISTINCT, sorted ASC

    -----------------------------------------------------------------------

    ========== XML Method =================================================

    SQL Server Execution Times:

    CPU time = 594 ms, elapsed time = 1416 ms.

    ========== CLR group_concat_sorted (custom inputs) ====================

    SQL Server Execution Times:

    CPU time = 1171 ms, elapsed time = 1536 ms.

    ========== CLR group_concat_dictionary_sorted (custom inputs) =========

    SQL Server Execution Times:

    CPU time = 454 ms, elapsed time = 838 ms.

    ========== CLR group_concat_delim_sorted (custom inputs) ==============

    SQL Server Execution Times:

    CPU time = 1265 ms, elapsed time = 1582 ms.

    ========== CLR group_concat_dictionary_delim_sorted (custom inputs) ===

    SQL Server Execution Times:

    CPU time = 594 ms, elapsed time = 856 ms.

    -----------------------------------------------------------------------

    NON-DISTINCT, sorted DESC

    -----------------------------------------------------------------------

    ========== XML Method =================================================

    SQL Server Execution Times:

    CPU time = 547 ms, elapsed time = 717 ms.

    ========== CLR group_concat_sorted (custom inputs) ====================

    SQL Server Execution Times:

    CPU time = 1203 ms, elapsed time = 1561 ms.

    ========== CLR group_concat_dictionary_sorted (custom inputs) =========

    SQL Server Execution Times:

    CPU time = 500 ms, elapsed time = 811 ms.

    ========== CLR group_concat_delim_sorted (custom inputs) ==============

    SQL Server Execution Times:

    CPU time = 1234 ms, elapsed time = 1870 ms.

    ========== CLR group_concat_dictionary_delim_sorted (custom inputs) ===

    SQL Server Execution Times:

    CPU time = 547 ms, elapsed time = 972 ms.

    [/font]

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • I guess I'm confused. I see the XLM method beat all but one or two of the CLR's in most groups in your latest test results (which are very well done, by the way).

    For the optimization I was talking about, we could start by removing an unnecessary and costly concatenation...

    ORDER BY N',' + ErrorDetail ASC

    Should be just ORDER BY ErrorDetail ASC

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 16 through 30 (of 40 total)

You must be logged in to reply to this topic. Login to reply