SQL Distinct comma delimited list


Erland Sommarskog
The article is certainly contradictory, but I've decided to stop at the first sentence:

The correct behavior for an aggregate concatenation query is undefined.

The article then goes out of its way to present scenarios where it may work after all. You should keep in mind that the article was originally published when SQL 2000 was the most recent version, and there were no alternatives.

It's important to understand that just because something works in one specific test, that is no guarantee that it will always work, unless there is documentation to say so. And here the documentation clearly says "undefined".

From SQL 2005 there is FOR XML PATH(''), which has well-defined behaviour. Never mind that the syntax is clunky and non-intuitive; this is the way to go if you need to build concatenated lists. (As long as you are not dealing with binary data.)
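
A minimal sketch of the FOR XML PATH('') approach for a distinct comma-delimited list. The table variable @Orders and its columns CustomerID and Product are made up for illustration and do not come from the thread. The multi-row VALUES clause assumes SQL 2008 or later; the FOR XML PATH part itself works from SQL 2005.

```sql
-- Hypothetical sample data; all names here are assumptions for illustration.
DECLARE @Orders TABLE (CustomerID int NOT NULL, Product varchar(20) NOT NULL);

INSERT INTO @Orders (CustomerID, Product)
VALUES (1, 'Apple'), (1, 'Pear'), (1, 'Apple'),
       (2, 'Plum'),  (2, 'Plum');

SELECT c.CustomerID,
       STUFF((SELECT ',' + o.Product
              FROM   @Orders AS o
              WHERE  o.CustomerID = c.CustomerID
              GROUP  BY o.Product          -- one entry per distinct product
              ORDER  BY o.Product          -- explicit ordering inside the list
              FOR XML PATH(''), TYPE).value('.', 'varchar(max)'),
             1, 1, '') AS ProductList      -- STUFF removes the leading comma
FROM   (SELECT DISTINCT CustomerID FROM @Orders) AS c
ORDER  BY c.CustomerID;
-- CustomerID 1 -> Apple,Pear
-- CustomerID 2 -> Plum
```

The TYPE directive with .value() is used so that characters such as & or < in the data are not left entity-encoded in the result.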

Erland Sommarskog, SQL Server MVP, www.sommarskog.se
ksatpute123
The discussion is going pretty well, and the different approaches mentioned are good. One thing I would like to point out here is that all the solutions revolve around the same principle and follow the logical query processing phases in SQL Server.

1. FROM
2. ON
3. OUTER
4. WHERE
5. GROUP BY
6. CUBE | ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. ORDER BY
11. TOP

Note that GROUP BY comes before SELECT and DISTINCT here. Of course, this applies independently to subqueries and virtual table expressions.
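
A small sketch of what the logical order listed above means in practice, assuming a made-up table variable @T with a single Amount column. An alias defined in SELECT is not visible in WHERE (an earlier phase) but is visible in ORDER BY (a later phase). This describes only how the result is defined logically, not the physical plan the optimizer picks.

```sql
-- Hypothetical sample data; @T and Amount are assumptions for illustration.
DECLARE @T TABLE (Amount int NOT NULL);
INSERT INTO @T (Amount) VALUES (5), (20), (20);

-- This would fail with "Invalid column name 'Doubled'", because WHERE is
-- processed before SELECT and the alias does not exist yet:
-- SELECT Amount * 2 AS Doubled FROM @T WHERE Doubled > 10;

-- ORDER BY is processed after SELECT and DISTINCT, so the alias is visible.
SELECT DISTINCT Amount * 2 AS Doubled
FROM   @T
WHERE  Amount * 2 > 10
ORDER  BY Doubled;
-- Returns a single row: 40
```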

Following this, I don't see any reason why GROUP BY over aggregate concatenation would not work in any scenario. So far I have never encountered an example of the "undefined". If anyone can post an example that precisely demonstrates the behavior described in the KB article, it would be helpful for all.
Erland Sommarskog
ksatpute123 (7/25/2013)
Following this, I don't see any reason why GROUP BY over aggregate concatenation would not work in any scenario. So far I have never encountered an example of the "undefined". If anyone can post an example that precisely demonstrates the behavior described in the KB article, it would be helpful for all.


If you never look when you cross the road "because there are never any cars in this area, and it has always worked for me", you will eventually be run over by a car.

And that is the whole gist of it. Microsoft are not giving you any guarantees that it will work, and thus you should not use it.

Examples? Yes, I have encountered cases where the result was only one of the rows in the result set. Repro? No, this was with older versions of SQL Server.

Erland Sommarskog, SQL Server MVP, www.sommarskog.se
ksatpute123
Even the article focuses its discussion on the ORDER BY clause. I am not saying that it is perfect just because I have never encountered any issues with it. I want to see an example that precisely explains why it would not work, so that all of us know what happens behind the scenes and gain a better understanding of how SQL Server works.
Erland Sommarskog
Again:

The correct behavior for an aggregate concatenation query is undefined.

The vendor is not giving any guarantees, and this is the key issue.

Up to SQL Server 6.x GROUP BY implied an ORDER BY. Then Hash Aggregates came along and broke that.

Up to SQL 2000, you could use TOP 100 PERCENT ... ORDER BY in a view definition, and a SELECT from the view was always ordered.

That is, what works today, may not work tomorrow, and Microsoft may just shrug their shoulders.

On the other hand, if FOR XML PATH breaks, you can scream "bug" and they will have to fix it.

Erland Sommarskog, SQL Server MVP, www.sommarskog.se
Eugene Elutin
Alan.B (7/24/2013)
Erland Sommarskog (7/24/2013)
Alan.B (7/24/2013)
Why is the SELECT @x=@x+ method not guaranteed to work?


Why would it?

See this KB article. Pay particular attention to the first sentence under Cause.


I say it would work based on the example I posted (which works). It produces the exact same plan and answer (except for the leading comma) as what Chris posted, which I believe is guaranteed to work. I need to read the article a little more (as well as this one), but I think it should work just fine.
...


As I pointed out in my previous post, one reason this concatenation may not work properly is query parallelism. You cannot guarantee that the query plan stays the same forever (unless you enforce it with hints).
And there is another one: I have seen it happen! A query of this style worked fine for a couple of years, then occasionally started to return unexpected results. It was not easy to find what was going wrong, because the problem was not easy to reproduce.
And again: if a query of this style is very basic, the probability of this problem is very low. As soon as such a query is complicated by JOINs or GROUP BY, the probability of it returning unexpected results grows.
In SQL 2008, MS introduced compound operators into T-SQL, e.g. +=, -=, *= etc. Check BoL: even for them, the only code samples MS added to BoL set values using SELECT or SET, not SELECT from a table, as they are not a replacement for aggregate functions.
Why not sum values just by doing SELECT @v += Column FROM table?
The reason is exactly the same. It's not guaranteed that it will aggregate all values!
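
A minimal sketch of the contrast, assuming a made-up table variable @T with columns ID, Val and Name (none of these come from the thread). The variable-accumulation statement is shown only to mark the pattern under discussion; the SUM and FOR XML PATH forms are the documented alternatives. Inline DECLARE initialization and += assume SQL 2008 or later.

```sql
-- Hypothetical sample data; all names here are assumptions for illustration.
DECLARE @T TABLE (ID int NOT NULL PRIMARY KEY,
                  Val int NOT NULL,
                  Name varchar(10) NOT NULL);
INSERT INTO @T (ID, Val, Name) VALUES (1, 10, 'a'), (2, 20, 'b'), (3, 30, 'c');

-- Pattern discussed above: the result is not guaranteed, so it is shown
-- for contrast only and its value is not used.
DECLARE @v int = 0;
SELECT @v += Val FROM @T;

-- Documented alternatives: a real aggregate for sums,
-- FOR XML PATH('') for concatenation.
DECLARE @sum int, @list varchar(max);

SELECT @sum = SUM(Val) FROM @T;

SELECT @list = STUFF((SELECT ',' + Name
                      FROM   @T
                      ORDER  BY ID
                      FOR XML PATH(''), TYPE).value('.', 'varchar(max)'),
                     1, 1, '');

SELECT @sum AS Total, @list AS NameList;
-- Total = 60, NameList = a,b,c
```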

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help
Alan.B
Eugene Elutin (7/25/2013)
Alan.B (7/24/2013)
Erland Sommarskog (7/24/2013)
Alan.B (7/24/2013)
Why is the SELECT @x=@x+ method not guaranteed to work?


Why would it?

See this KB article. Pay particular attention to the first sentence under Cause.


I say it would work based on the example I posted (which works). It produces the exact same plan and answer (except for the leading comma) as what Chris posted, which I believe is guaranteed to work. I need to read the article a little more (as well as this one), but I think it should work just fine.
...


As I pointed out in my previous post, one reason this concatenation may not work properly is query parallelism. You cannot guarantee that the query plan stays the same forever (unless you enforce it with hints).
And there is another one: I have seen it happen! A query of this style worked fine for a couple of years, then occasionally started to return unexpected results. It was not easy to find what was going wrong, because the problem was not easy to reproduce.
And again: if a query of this style is very basic, the probability of this problem is very low. As soon as such a query is complicated by JOINs or GROUP BY, the probability of it returning unexpected results grows.
In SQL 2008, MS introduced compound operators into T-SQL, e.g. +=, -=, *= etc. Check BoL: even for them, the only code samples MS added to BoL set values using SELECT or SET, not SELECT from a table, as they are not a replacement for aggregate functions.
Why not sum values just by doing SELECT @v += Column FROM table?
The reason is exactly the same. It's not guaranteed that it will aggregate all values!


I get it now. Thanks Eugene and Erland!

-- Alan Burstein



Best practices for getting help on SQLServerCentral
Need to split a string? Try DelimitedSplit8K or DelimitedSplit8K_LEAD (SQL 2012+)
Need a pattern-based splitter? Try PatternSplitCM
Need to remove or replace those unwanted characters? Try PatExclude8K and PatReplace8K.

"I can't stress enough the importance of switching from a 'sequential files' mindset to 'set-based' thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code." -- Itzik Ben-Gan 2001