Concatenating Rows

  • Jeff Moden (3/5/2011)


    arturv (3/4/2011)


    Hi guys,

    why dont we keep it simple?

    SELECT @fruit = COALESCE(@fruit + '', '') + name

    Because that's RBAR an solves only for one scalar return. Also, if the list is long, the code begins to slow down almost exponentially because it has to keep rebuilding the variable internally as well as growing the memory allocation for the variable which is also relatively expensive.

    Well it's not really RBAR - it is a set-based solution of sorts, but you are absolutely right to point out that it is not at all suitable for a large number of elements, or an otherwise long result.

    There are several other methods which are much better: the FOR XML PATH trick is the one most commonly employed. Yes, it may be a bit of a hack (it's an odd use of 'XML') and the contortions required to handle special characters like & can be annoying, and it won't work at all with some rare embedded characters, but even with all those caveats, it's still generally much to be preferred. Until Microsoft provide a well-performing built-in CONCAT function, it's one of the better options.

    Another high-performance solution which has no character restrictions and a more intuitive syntax is based on SQLCLR. For very large strings, the RBAR .WRITE method is surprisingly worth considering. You can find test scripts and performance comparisons here:

    http://florianreischl.blogspot.com/2010/01/concatenation-of-text-and-binary-data.html

    Paul

  • SQLkiwi (3/7/2011)[hrWell it's not really RBAR - it is a set-based solution of sorts, but you are absolutely right to point out that it is not at all suitable for a large number of elements, or an otherwise long result.

    At the risk of quibbling, yeah... it's RBAR because it requires a scalar UDF (or equivalent) to return just one text aggregate and it handles the same data over and over until the full string is constructed. 😉 Call it what you want, though, because no matter what you and I call it, we both agree that it's also called "fairly slow". 😀

    Another high-performance solution which has no character restrictions and a more intuitive syntax is based on SQLCLR. For very large strings, the RBAR .WRITE method is surprisingly worth considering. You can find test scripts and performance comparisons here:

    This DOES seem like one of those places where SQLCLR would certainly outstripe T-SQL prestidigitaion. Have you done a comparison between the two>

    Another question, please because I don't work with C#... when you speak of the .WRITE method, are you talking about T-SQL or the SQLCLR?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Never mind on the .WRITE method. I just found the answer in Flo's article that you provided a link for. Thanks for the link.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/7/2011)


    At the risk of quibbling, yeah... it's RBAR because it requires a scalar UDF (or equivalent) to return just one text aggregate and it handles the same data over and over until the full string is constructed. 😉 Call it what you want, though, because no matter what you and I call it, we both agree that it's also called "fairly slow". 😀

    I think I see what you mean, but we're talking about the "SELECT @fruit = COALESCE(@fruit + '', '') + name FROM table" method aren't we? There's no UDF or explicit loop there...or am I missing something? Quite possible, it is 4:40am!

  • This works to simplify the 'for xml path ('')' option:

    select substring((Select rtrim(ltrim(','+name)) from fruit order by id for xml path('')),2,1000) as FruitList

    The rtrim/ltrim function apparently removes the unprintable xml tokens, and the substring removes the first comma. This works consistently, where used without the rtrim/ltrim is spotty, as the author suggests.

  • SQLkiwi (3/7/2011)


    Jeff Moden (3/7/2011)


    At the risk of quibbling, yeah... it's RBAR because it requires a scalar UDF (or equivalent) to return just one text aggregate and it handles the same data over and over until the full string is constructed. 😉 Call it what you want, though, because no matter what you and I call it, we both agree that it's also called "fairly slow". 😀

    I think I see what you mean, but we're talking about the "SELECT @fruit = COALESCE(@fruit + '', '') + name FROM table" method aren't we? There's no UDF or explicit loop there...or am I missing something? Quite possible, it is 4:40am!

    True enough. If you only need to create a single text aggregate, you don't need a scalar UDF. It's still RBAR, though. Explicit loops aren't the only form of RBAR. There is such a thing as "Hidden RBAR". You remember the "Triangular Join" thing for sums that you and I had fun with? Even Ben-gan wrote about the O2 problem with using COALESCE (or ISNULL) to overwrite variables in such a fashion.

    My problem with people who use this method even on short stuff is that some unlucky sucker who has a schedule to meet may "leverage" the code for something bigger and against a whole table instead of just one "grouping".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jim Etheridge (3/7/2011)


    This works to simplify the 'for xml path ('')' option: {snip]

    The rtrim/ltrim function apparently removes the unprintable xml tokens, and the substring removes the first comma. This works consistently, where used without the rtrim/ltrim is spotty, as the author suggests.

    Thanks, Jim. 🙂 Cool tip. I'm going to have to try it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • How about something like this: (similar to solution 2)

    DECLARE @listCol VARCHAR(2000)

    SELECT @listCol = STUFF(( SELECT DISTINCT ', ' + name

    FROM fruit

    ORDER BY 1

    FOR XML PATH('')

    ), 1, 2, ' ')

    select @listcol

  • Thanks for the great post. It almost completely solved an issue I had at work. I only needed the concatenation to work by group within the result set. I posted my code on my blog here:

  • solution 3

    select LEFT(concatenations,len(concatenations)-1) from

    (select (select cast(name as varchar)+','from fruit order by id for xml path('')) as concatenations)a

Viewing 10 posts - 151 through 159 (of 159 total)

You must be logged in to reply to this topic. Login to reply