A simple T-SQL statement to create a list of lookup values

  • Hi

    This is the same that I posted, please see here:

    [/url]

    like brian.healy had posted you can use xml

    select SUBSTRING((SELECT (',' + id) FROM teste ORDER BY id FOR XML PATH('') ), 2, 4000)

    adapted to your code:

    select SUBSTRING((SELECT (',' + [StatusDesc]) FROM MyStatus ORDER BY [StatusDesc] FOR XML PATH('') ), 2, 4000)

    with distinct...you may use:

    select SUBSTRING((SELECT (',' + T.[StatusDesc]) FROM (select distinct [StatusDesc] from MyStatus) T ORDER BY T.[StatusDesc] FOR XML PATH('') ), 2, 4000)

    But Thanks for your post

    LFCost

  • Jason-299789 (3/21/2011)


    Cursors in MS T-SQL should be avoided unless absolutely necessary as they are extremely slow and inefficent.

    Using one of the methods mentioned in previous posts is a better and more efficent alternative to the method you mention, my personal preference is the XML conversion with a nested Replace.

    A general statement like this is more dangerous than the use of cursors. 😛

    Check out http://sqlblog.com/blogs/hugo_kornelis/archive/2007/11/21/curious-cursor-optimization-options.aspx. While specialised in its case it counteracts the above general statement and thus makes it invalid. Sorry.

  • Knut Boehnert (9/28/2012)


    A general statement like this is more dangerous than the use of cursors. 😛

    Check out http://sqlblog.com/blogs/hugo_kornelis/archive/2007/11/21/curious-cursor-optimization-options.aspx. While specialised in its case it counteracts the above general statement and thus makes it invalid. Sorry.

    As the author of the blog post you link to, I must contradict you.

    That blog post does describe how you can get the best performance out of cursors. But it also warns (and supports, by giving performance figures) that set-based SQL is lots, lots, and lots faster. Some relevant quotes fro that blog post:

    "The best way to optimize performance of a cursor is, of course, to rip it out and replace it with set-based logic."

    (...)

    "With these options, execution time went down (...) to 3.3 to 3.4 seconds. Of course, none of those come even close to the 0.2 seconds of the set-based equivalent for this test case"

    (...)

    "If you have to optimize a cursor for performance, keep the following considerations in mind:

    1. Always try to replace the cursor by a set-based equivalent first. If you fail to see how, do not hesitate to ask (...)"

    Yes, there are cases where a cursor is indeed the fastest (or, should I say, least slow) option.

    No, concatenating strings from a result set is not one of them. There are better and faster ways to do this.

    No, the method presented in this article is not the better and faster way, because it is undocumented, unsupported, and known to potentially return incorrect results. I must say that I am really disappointed to see this article being rerun, as that suggests that SQLServerCentral.com supports this method.

    But the FOR XML method (presented in various posts in this topic) is documented and supported, and a whole lot faster than even the best optimized cursor.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • I always use the stuff function to make a string array seperated by a delimeter:

    declare @mystatus as table (stat_id int identity(1,1), stat_desc varchar(25) null);

    insert @mystatus

    select 'Active'

    union all select 'Onhold'

    union all select 'Disabled'

    union all select 'Closed'

    -- stuff values into array

    declare @my_stat_list varchar(250)

    set @my_stat_list = stuff((select /*distinct*/ ',' + stat_desc from @mystatus order by stat_desc for xml path ('')),1,1,'')

    select @my_stat_list

  • I feel this could be done using stuff and for xml(path).We need not trim the string in the end.

  • /*-------STUFF approach -----------------*/

    SELECT top(1)

    STUFF((

    SELECT ', ' + CAST([StatusDesc] AS VARCHAR(100)) FROM MyStatus FOR XML PATH (''))

    ,1,2,'') AS NameValues

    FROM MyStatus Results

    /*------- end STUFF approach -----------------*/

  • Hugo Kornelis (3/21/2011)


    stefan.hoffmann (3/21/2011)


    hi Hugo,

    I'd like to disagree. While the last example uses only @var = column the explanation above it clearly states:

    "If a SELECT statement returns more than one row and the variable references a nonscalar expression, the variable is set to the value returned for the expression in the last row of the result set. For example, in the following batch @EmpIDVariable is set to the BusinessEntityID value of the last row returned, which is 1:"

    "references a nonscalar expression" should describe exactly our SELECT @var = @var + column FROM table ORDER BY columnOrExpression scenario.

    Or is there any subtle interpretation I don't get here, maybe my English is not sufficient enough...

    Hi Stefan,

    I must admit that it is not as clear-cut as I thought it was. I think I remember that there was an explicit disclaimer for the @var = @var + col case, but I can't find it anymore - maybe I am confisunng this with the equally dangerous SET @var = col = @var + othercol extension to the UPDATE statement.

    Basically, if you take the wording very literally, then yes you are right: the variable should be set to the expression in the last row. And if the column value happens to be 'A' for the last row, then the value of @var after SELECT @var = @var + col FROM SomeTable should be set to @var + 'A'. But how does this help us define the expected output? What value for @var has to be used by SQL Server when processing this last row? The value that resulted from the second to last row? The value @var had before the query was started? Or even something else (imagine a parallel plan that uses three processors to process the three partitions of partitioned table SomeTable).

    Anyway, I did find another useful link - a Microsoft Knowledge Base article that excplicitly warns that the results of this kind of queries ("aggregate concatenation queries") are undefined - see especially the first sentence after the "Cause" heading: "The correct behavior for an aggregate concatenation query is undefined."

    I agree with Hugo.

    Using the [improper] technique of SELECTing a column and performing an operation on it such as building a comma delimited string will yield unpredictable[/i] results! I have been burnt by this! Try it on the contents of a table variable.

    I have also used the XML method to create comma-delimited values with partial success. If the source data does not contain any invalid XML characters and you will accept encoded values for certain characters in the final result (e.g., <, >, &, etc.) then it does work.

    So use the proper coding technique. Use a cursor. Otherwise you are just coding a "land mine" / "time bomb" that will have to be fixed later. I know, as I've had to fix them.


    [font="Arial Narrow"](PHB) I think we should build an SQL database. (Dilbert) What color do you want that database? (PHB) I think mauve has the most RAM.[/font]

  • Hi Suresh,

    Thanks for your post but there are some possible problems if you start to use ORDER BY.

    See

    http://stackoverflow.com/questions/9938853/order-by-when-concating-a-varchar-does-not-work-as-expected?rq=1

    for example

    DECLARE @MyStatusList VARCHAR(MAX)

    SET @MyStatusList = ''

    SELECT @MyStatusList = @MyStatusList + CONVERT(VARCHAR(20), Field)

    FROM Table1

    -- ORDER BY LEN(Field) DESC, Field ASC

    SELECT @MyStatusList

    will give an unexpectedly different result from

    DECLARE @MyStatusList VARCHAR(MAX)

    SET @MyStatusList = ''

    SELECT @MyStatusList = @MyStatusList + CONVERT(VARCHAR(20), Field)

    FROM Table1

    ORDER BY LEN(Field) DESC, Field ASC

    SELECT @MyStatusList

    Steve

  • You need to remember that string manipulation in windows is horribly slow. I wrote an article on creating delimited lists using the FOR XML in 2011 here[/url]. Using this method, I've seen a process that took 30 minutes go to < 1 second... can't beat that kind of boost.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • The article posted was very nice, and we used that technique for years, and it was great, and definitely cut down on the amount of code necessary. Thank you to the author. I don't think that technique is documented in SQL Server.

    We found that we often needed this type of ability (i.e. string aggregation) on the fly, or within a view, to return a comma separated list of values, for every row.

    We used to write functions to handle that, but it was terribly inefficient b/c it would essentially query once per row, within the overall query.

    Oracle introduced user defined aggregates a while ago, and Microsoft added support for them via the CLR recently. It's awesome. If you haven't used it, you're missing out. It's one of my favorite new features of SQL Server in the past many years.

    It allows you to use your function just like you would a normal aggregate, either in an ad-hoc query or a view, like this...

    select

    some_field,

    dbo.collect(some_other_field, ', ') as some_other_field_list

    from

    some_table

    group by

    some_field

    One of the natural advantages of this approach, aside from performance, is that any complicated filtering you had going in your from and where clause is automatically involved in what is aggregated/collected. Whereas, if you had to call a function in the select, all you could do was pass some kind of id to try to simulate the context. That's hard, clunky, and sometimes actually 100% impossible.

    Here is the code to our current version of what we call 'collect', my new best friend.

    Beats the pants off of the other methods. We've found we need to enhance it some, to handle collecting other things, like dates, and numbers, but for now, this version works well.

    One thing, we did find that occasionally it will get messed up if you don't set the max degrees of parallelism to 1 (i.e. option maxdop(1)) at the end of you query. We're going to try to re-write it to deal with parallel query. It's a little complicated but can be done.

    Anyway, here's the version we have an use, which can be easily compiled into a DLL and then loaded into your database as an assembly. Can post the surrounding information if anyone is interested. You do need a little wrapper aggregate function in T-SQL as well.

    using System;

    using System.Collections.Generic;

    using System.Data.SqlTypes;

    using System.IO;

    using System.Linq;

    using Microsoft.SqlServer.Server;

    namespace WitsCLR.UserDefinedAggregates

    {

    [Serializable]

    [SqlUserDefinedAggregate

    (

    Format.UserDefined,

    MaxByteSize = -1,

    IsInvariantToDuplicates = true,

    IsInvariantToNulls = true,

    IsInvariantToOrder = false,

    IsNullIfEmpty = true

    )

    ]

    // ***************************************************************************

    // * public class Collect : IBinarySerialize

    // ***************************************************************************

    public class Collect : IBinarySerialize

    {

    private string delimiter;

    private List<string> result;

    // ***************************************************************************

    // * public void Init

    // ***************************************************************************

    public void Init()

    {

    this.delimiter = "";

    this.result = new List<string>();

    }

    // ***************************************************************************

    // * public void Accumulate

    // ***************************************************************************

    public void Accumulate(SqlString expression, string delimiter)

    {

    if ((expression.IsNull) || (delimiter == null))

    {

    return;

    }

    this.delimiter = delimiter;

    this.result.Add(expression.Value);

    }

    // ***************************************************************************

    // * public void Merge

    // ***************************************************************************

    public void Merge(Collect other)

    {

    this.result.AddRange(other.result.ToArray());

    }

    // ***************************************************************************

    // * public string Terminate

    // ***************************************************************************

    public string Terminate()

    {

    if (this.result.Count == 0)

    {

    return null;

    }

    IEnumerable<string> distinctResult = this.result.Distinct();

    this.result = distinctResult.ToList<string>();

    this.result.Sort();

    return string.Join(this.delimiter, this.result.ToArray());

    }

    // ***************************************************************************

    // * public void Read

    // ***************************************************************************

    public void Read(BinaryReader r)

    {

    this.delimiter = r.ReadString();

    int itemCount = r.ReadInt32();

    this.result = new List<string>(itemCount);

    for (int i = 0; i < itemCount; i++)

    {

    this.result.Add(r.ReadString());

    }

    }

    // ***************************************************************************

    // * public void Write

    // ***************************************************************************

    public void Write(BinaryWriter w)

    {

    w.Write(this.delimiter);

    w.Write(this.result.Count);

    foreach (string s in this.result)

    {

    w.Write(s);

    }

    }

    }

    }

    And the T-SQL part...

    create aggregate collect

    (

    @expression nvarchar(max),

    @delimiter nvarchar(max)

    )

    returns nvarchar(max)

    external name WitsCLR.[WitsCLR.UserDefinedAggregates.Collect]

    go

    Hope this helps.

    Ryan

  • select STUFF(

    (

    SELECT ',' + StatusDesc

    FROM MyStatusList

    FOR XML PATH('')

    ORDER BY Status_Id

    ),

    1,

    1,

    ''

    )

  • I tried each of the queries on the article and could only get the cursor to work. Other than that the coalesce by josef worked as well. Did anyone else have this issue?

  • PS

    Sorry, this time including some sample data

    CREATE TABLE [dbo].[MyStatus](

    [Status_Id] [int] IDENTITY(1,1) PRIMARY KEY NOT NULL,

    [StatusDesc] [varchar](25) NULL,

    )

    INSERT INTO MyStatus VALUES ('Active')

    INSERT INTO MyStatus VALUES ('OnHold')

    INSERT INTO MyStatus VALUES ('Disabled')

    INSERT INTO MyStatus VALUES ('Closed')

    DECLARE @MyStatusList VARCHAR(MAX)

    SET @MyStatusList = ''

    SELECT @MyStatusList = @MyStatusList + CONVERT(VARCHAR(20), StatusDesc)

    FROM MyStatus

    -- ORDER BY LEN(StatusDesc) DESC, StatusDesc ASC

    SELECT @MyStatusList

    -- will give an unexpectedly different result if you uncomment the ORDER BY

  • This is called a Group_Concat in MySQL. SQL Server does not provide such functionality. It has to be mimicked with the @var method you describe.

    Here is a similar article on the technique: http://databases.aspfaq.com/general/how-do-i-concatenate-strings-from-a-column-into-a-single-row.html

  • A different approach, with its own pros and cons, is to use a CLR aggregate function to generate the comma-separated list. Microsoft provides the "Concatenate" function for this purpose in its StringUtilities CLR code sample. This method requires adding an assembly to the database, which may not be possible in some circumstances.

Viewing 15 posts - 61 through 75 (of 87 total)

You must be logged in to reply to this topic. Login to reply