SQL Distinct comma delimited list
Posted Thursday, July 25, 2013 1:48 AM


The article is certainly contradictory, but I've decided to stop at the first sentence:

The correct behavior for an aggregate concatenation query is undefined.

The article then goes out of its way to present scenarios where it may work after all. You should keep in mind that the article was originally published when SQL 2000 was the most recent version, and there were no alternatives.

It's important to understand that just because something works in one specific test, that is no guarantee that it will always work, unless there is documentation to say so. And here the documentation clearly says "undefined".

From SQL 2005 there is FOR XML PATH(''), which has a well-defined behaviour. Never mind that the syntax is clunky and non-intuitive. This is the way to go if you need to build concatenated lists. (As long as you are not dealing with binary data.)
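
A minimal sketch of the FOR XML PATH('') approach for a distinct, comma-delimited list per group. The table variable @Items and its columns are hypothetical and exist only for illustration; the multi-row VALUES syntax assumes SQL 2008 or later. GROUP BY is used inside the subquery to remove duplicates so that the list can also be ordered by name.

```sql
-- Hypothetical sample data, for illustration only.
DECLARE @Items TABLE (GroupId int NOT NULL, Name nvarchar(50) NOT NULL);

INSERT INTO @Items (GroupId, Name)
VALUES (1, N'apple'), (1, N'pear'), (1, N'apple'), (2, N'fig');

SELECT g.GroupId,
       -- STUFF removes the leading comma produced by the concatenation.
       STUFF((SELECT N',' + i.Name
              FROM @Items AS i
              WHERE i.GroupId = g.GroupId
              GROUP BY i.Name          -- one entry per distinct name
              ORDER BY i.Name          -- defined order within the list
              FOR XML PATH(''), TYPE
             ).value('.', 'nvarchar(max)'), 1, 1, N'') AS NameList
FROM (SELECT DISTINCT GroupId FROM @Items) AS g
ORDER BY g.GroupId;
```

The TYPE directive together with .value() is there so that characters such as & and < come back unescaped rather than as XML entities.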


Erland Sommarskog, SQL Server MVP, www.sommarskog.se
Post #1477372
Posted Thursday, July 25, 2013 2:09 AM
The discussion is going pretty well and the different approaches mentioned are good. One thing I would like to point out here is that all the solutions revolve around the same principle and follow the logical query processing phases in SQL Server.

1. FROM
2. ON
3. OUTER
4. WHERE
5. GROUP BY
6. CUBE | ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. ORDER BY
11. TOP

Pay attention: here GROUP BY comes before SELECT and DISTINCT. Of course, this applies independently to subqueries and virtual table expressions.
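
One visible consequence of the logical order listed above can be sketched as follows: a column alias defined in SELECT is not yet available in HAVING, but it is available in ORDER BY. The table variable @Orders and its data are hypothetical. Note that this order describes how a query is interpreted logically, not how the optimizer physically executes it.

```sql
-- Hypothetical sample data, for illustration only.
DECLARE @Orders TABLE (CustomerId int NOT NULL, Amount money NOT NULL);

INSERT INTO @Orders (CustomerId, Amount)
VALUES (1, 80), (1, 50), (2, 30), (3, 200);

SELECT CustomerId, SUM(Amount) AS Total
FROM @Orders
WHERE Amount > 0            -- evaluated before grouping
GROUP BY CustomerId
HAVING SUM(Amount) > 100    -- the alias Total cannot be referenced here
ORDER BY Total DESC;        -- the alias is visible, ORDER BY follows SELECT
```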

Following this I don't see any reason why GROUP BY over aggregate concatenation will not work in any scenario. So far I have never encountered an example of the "undefined". If anyone can post an example precisely explaining this nature mentioned in the KB article then it would be helpful for all.
Post #1477384
Posted Thursday, July 25, 2013 2:12 AM


ksatpute123 (7/25/2013)
Following this I don't see any reason why GROUP BY over aggregate concatenation will not work in any scenario. So far I have never encountered an example of the "undefined". If anyone can post an example precisely explaining this nature mentioned in the KB article then it would be helpful for all.


If you never look when you cross the road "because there are never any cars in this area, and it has always worked for me", you will eventually be run over by a car.

And that is the whole gist of it. Microsoft are not giving you any guarantees that it will work, and thus you should not use it.

Examples? Yes, I have encountered cases where the result was only one of the rows in the result set. Repro? No, this was with older versions of SQL Server.


Erland Sommarskog, SQL Server MVP, www.sommarskog.se
Post #1477388
Posted Thursday, July 25, 2013 2:21 AM
Even the article focuses its discussion on the ORDER BY clause. I am not saying that because I have never encountered any issues with it, it is perfect. I want to see an example which precisely explains why it would not work, so all of us will know what happens behind the scenes and thus have a better understanding of how SQL Server works.
Post #1477391
Posted Thursday, July 25, 2013 2:39 AM


Again:

The correct behavior for an aggregate concatenation query is undefined.

The vendor is not giving any guarantees, and this is the key issue.

Up to SQL Server 6.x GROUP BY implied an ORDER BY. Then Hash Aggregates came along and broke that.

Up to SQL 2000, you could use TOP 100 PERCENT ... ORDER BY in a view definition, and a SELECT from the view was always ordered.

That is, what works today, may not work tomorrow, and Microsoft may just shrug their shoulders.

On the other hand, if FOR XML PATH breaks, you can scream "bug" and they will have to fix it.



Erland Sommarskog, SQL Server MVP, www.sommarskog.se
Post #1477395
Posted Thursday, July 25, 2013 3:57 AM
Alan.B (7/24/2013)
Erland Sommarskog (7/24/2013)
Alan.B (7/24/2013)
Why is the SELECT @x=@x+ method not guaranteed to work?


Why would it?

See this KB article. Pay particular attention to the first sentence under Cause.


I say it would work based on the example I posted (which works). It produces the exact same plan and answer (except for the leading comma) as what Chris posted, which I believe is guaranteed to work. I need to read the article a little more (as well as this one) but I think it should work just fine.
...


As I pointed out in my previous post, one of the reasons why this concatenation may not work properly is query parallelism. You cannot guarantee that the query plan stays the same forever (unless you enforce it with hints).
And there is another one: I have seen it happen! A query in this style worked fine for a couple of years, then occasionally started to return unexpected results. It was not easy to find out what was going wrong, because the problem was not easy to reproduce.
And again: if a query in this style is very basic, the probability of this problem is very low. As soon as such a query is complicated by JOINs or GROUP BY, the probability of it returning unexpected results grows.
SQL 2008 introduced compound operators into T-SQL, e.g. +=, -=, *= etc. Check BOL: even for them, the only code samples MS added to BOL set values using SELECT or SET, not SELECT from a table, as they are not a replacement for aggregate functions.
Why not sum values just by doing SELECT @v += Column FROM table?
The reason is exactly the same. It's not guaranteed that it will aggregate all values!
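
To make the contrast concrete, here is a small sketch with a hypothetical table variable @Numbers. The first statement uses the variable-accumulation pattern discussed in this thread, whose result over multiple rows is not guaranteed. The second uses the SUM aggregate, which is documented to process every qualifying row. On a tiny table like this one both will most likely return the same value, which is exactly why the problem is easy to overlook.

```sql
-- Hypothetical sample data, for illustration only (SQL 2008 or later syntax).
DECLARE @Numbers TABLE (Val int NOT NULL);

INSERT INTO @Numbers (Val) VALUES (1), (2), (3);

-- Accumulating into a variable across rows: behaviour is not guaranteed.
DECLARE @v int = 0;
SELECT @v += Val FROM @Numbers;

-- Aggregate function: documented to cover all rows.
DECLARE @sum int;
SELECT @sum = SUM(Val) FROM @Numbers;

SELECT @v AS VariableResult, @sum AS AggregateResult;
```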


_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!"
(So many miracle inventions provided by MS to us...)

Post #1477416
Posted Thursday, July 25, 2013 12:30 PM


Eugene Elutin (7/25/2013)
Alan.B (7/24/2013)
Erland Sommarskog (7/24/2013)
Alan.B (7/24/2013)
Why is the SELECT @x=@x+ method not guaranteed to work?


Why would it?

See this KB article. Pay particular attention to the first sentence under Cause.


I say it would work based on the example I posted (which works). It produces the exact same plan and answer (except for the leading comma) as what Chris posted, which I believe is guaranteed to work. I need to read the article a little more (as well as this one) but I think it should work just fine.
...


As I pointed out in my previous post, one of the reasons why this concatenation may not work properly is query parallelism. You cannot guarantee that the query plan stays the same forever (unless you enforce it with hints).
And there is another one: I have seen it happen! A query in this style worked fine for a couple of years, then occasionally started to return unexpected results. It was not easy to find out what was going wrong, because the problem was not easy to reproduce.
And again: if a query in this style is very basic, the probability of this problem is very low. As soon as such a query is complicated by JOINs or GROUP BY, the probability of it returning unexpected results grows.
SQL 2008 introduced compound operators into T-SQL, e.g. +=, -=, *= etc. Check BOL: even for them, the only code samples MS added to BOL set values using SELECT or SET, not SELECT from a table, as they are not a replacement for aggregate functions.
Why not sum values just by doing SELECT @v += Column FROM table?
The reason is exactly the same. It's not guaranteed that it will aggregate all values!


I get it now. Thanks Eugene and Erland!


-- Alan Burstein



Post #1477667