SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Creating a comma-separated list (SQL Spackle)


Creating a comma-separated list (SQL Spackle)

Author
Message
cs_troyk
cs_troyk
SSCommitted
SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)

Group: General Forum Members
Points: 1712 Visits: 969
opc.three (11/8/2012)
cs_troyk (10/29/2012)
Not that I advocate switching to a "toy" RDBMS, but MySQL does have the GROUP_CONCAT operator which makes this type of thing a little more elegant. It's something to be aware of in case you ever need to migrate from a MySQL db.

-TroyK

MySQL's GROUP_CONCAT() function implements a semantically different piece of functionality than what Wayne has demonstrated in the article. Notice that the query in the article is not an aggregate, i.e. no GROUP BY clause.


Take a look at the CTE from the article, and consider the effect of the "DISTINCT" keyword.

Here's the MySQL solution to the comma-separated list problem:

SELECT AccountNumber, GROUP_CONCAT(Value ORDER BY Value) AS 'CommaList'
FROM T
GROUP BY AccountNumber
ORDER BY AccountNumber;

It produces the same output as the T-SQL solution.

-TroyK



Orlando Colamatteo
Orlando Colamatteo
SSCoach
SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)

Group: General Forum Members
Points: 15033 Visits: 14396
cs_troyk (11/8/2012)
opc.three (11/8/2012)
cs_troyk (10/29/2012)
Not that I advocate switching to a "toy" RDBMS, but MySQL does have the GROUP_CONCAT operator which makes this type of thing a little more elegant. It's something to be aware of in case you ever need to migrate from a MySQL db.

-TroyK

MySQL's GROUP_CONCAT() function implements a semantically different piece of functionality than what Wayne has demonstrated in the article. Notice that the query in the article is not an aggregate, i.e. no GROUP BY clause.


Take a look at the CTE from the article, and consider the effect of the "DISTINCT" keyword.

Here's the MySQL solution to the comma-separated list problem:

SELECT AccountNumber, GROUP_CONCAT(Value ORDER BY Value) AS 'CommaList'
FROM T
GROUP BY AccountNumber
ORDER BY AccountNumber;

It produces the same output as the T-SQL solution.

-TroyK

Not sure what you're getting at when you talk about DISTINCT, whether you mean in the outer query or the correlated subquery, but if the subquery you would need to provide DISTINCT when using GROUP_CONCAT() as well, if duplicate values were a concern:

SELECT AccountNumber, GROUP_CONCAT(DISTINCT Value ORDER BY Value) AS 'CommaList'
FROM T
GROUP BY AccountNumber
ORDER BY AccountNumber;

If the outer query then yes, I agree, an aggregate may be a better option. I am not arguing about the likeness of the results however. All I am pointing out is that there is a difference in implementation, semantical maybe, but there is a penalty for using GROUP BY when it is not needed when compared to using a correlated subquery as Wayne presents it.

If GROUP_CONCAT() existed on SQL Server the XML technique from the article would almost certainly outperform it so while it may seem like a fair comparison from a query results point of view it's really apples and oranges under the hood.

Now, from a syntax point of view, when the outer query must be distinct, I like the GROUP_CONCAT() method which is why I wrote GROUP_CONCAT() for SQL Server by leveraging the SQLCLR but it is not the best choice for all scenarios.

__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
cs_troyk
cs_troyk
SSCommitted
SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)SSCommitted (1.7K reputation)

Group: General Forum Members
Points: 1712 Visits: 969
opc.three (11/8/2012)
cs_troyk (11/8/2012)
opc.three (11/8/2012)
cs_troyk (10/29/2012)
Not that I advocate switching to a "toy" RDBMS, but MySQL does have the GROUP_CONCAT operator which makes this type of thing a little more elegant. It's something to be aware of in case you ever need to migrate from a MySQL db.

-TroyK

MySQL's GROUP_CONCAT() function implements a semantically different piece of functionality than what Wayne has demonstrated in the article. Notice that the query in the article is not an aggregate, i.e. no GROUP BY clause.


Take a look at the CTE from the article, and consider the effect of the "DISTINCT" keyword.

Here's the MySQL solution to the comma-separated list problem:

SELECT AccountNumber, GROUP_CONCAT(Value ORDER BY Value) AS 'CommaList'
FROM T
GROUP BY AccountNumber
ORDER BY AccountNumber;

It produces the same output as the T-SQL solution.

-TroyK

Not sure what you're getting at when you talk about DISTINCT, whether you mean in the outer query or the correlated subquery, but if the subquery you would need to provide DISTINCT when using GROUP_CONCAT() as well, if duplicate values were a concern:

[code="sql"SELECT AccountNumber, GROUP_CONCAT(DISTINCT Value ORDER BY Value) AS 'CommaList'
FROM T
GROUP BY AccountNumber
ORDER BY AccountNumber;[/code]

If the outer query then yes, I agree, an aggregate may be a better option. I am not arguing about the likeness of the results however. All I am pointing out is that there is a difference in implementation, semantical maybe, but there is a penalty for using GROUP BY when it is not needed when compared to using a correlated subquery as Wayne presents it.

If GROUP_CONCAT() existed on SQL Server the XML technique from the article would almost certainly outperform it so while it may seem like a fair comparison from a query results point of view it's really apples and oranges under the hood.

Now, from a syntax point of view, when the outer query must be distinct, I like the GROUP_CONCAT() method which is why I wrote GROUP_CONCAT() for SQL Server by leveraging the SQLCLR but it is not the best choice for all scenarios.


My point is that the solution presented in the article is the equivalent to a solution using GROUP BY, so saying that there's no GROUP BY clause is only accurate when talking about the syntax. The query could be rewritten like this, and it's the same query from the optimizer's perspective:


SELECT AccountNumber,
CommaList = STUFF((
SELECT ',' + Value
FROM #TestData
WHERE AccountNumber = t.AccountNumber
ORDER BY Value
FOR XML PATH(''),
TYPE).value('.','varchar(max)'),1,1,'')
FROM #TestData t
GROUP BY AccountNumber
ORDER BY AccountNumber;



This produces the same query plan as the CTE version, and has the same IO profile, so it is incorrect to assert some penalty for using GROUP BY. I'm not sure how you arrive at the conclusion that the XML version would outperform GROUP_CONCAT if the latter were implemented in MS's product. Is this based on your results from implementing it using CLR?

-TroyK



Orlando Colamatteo
Orlando Colamatteo
SSCoach
SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)

Group: General Forum Members
Points: 15033 Visits: 14396
cs_troyk (11/9/2012)
opc.three (11/8/2012)
cs_troyk (11/8/2012)
opc.three (11/8/2012)
cs_troyk (10/29/2012)
Not that I advocate switching to a "toy" RDBMS, but MySQL does have the GROUP_CONCAT operator which makes this type of thing a little more elegant. It's something to be aware of in case you ever need to migrate from a MySQL db.

-TroyK

MySQL's GROUP_CONCAT() function implements a semantically different piece of functionality than what Wayne has demonstrated in the article. Notice that the query in the article is not an aggregate, i.e. no GROUP BY clause.


Take a look at the CTE from the article, and consider the effect of the "DISTINCT" keyword.

Here's the MySQL solution to the comma-separated list problem:

SELECT AccountNumber, GROUP_CONCAT(Value ORDER BY Value) AS 'CommaList'
FROM T
GROUP BY AccountNumber
ORDER BY AccountNumber;

It produces the same output as the T-SQL solution.

-TroyK

Not sure what you're getting at when you talk about DISTINCT, whether you mean in the outer query or the correlated subquery, but if the subquery you would need to provide DISTINCT when using GROUP_CONCAT() as well, if duplicate values were a concern:

[code="sql"SELECT AccountNumber, GROUP_CONCAT(DISTINCT Value ORDER BY Value) AS 'CommaList'
FROM T
GROUP BY AccountNumber
ORDER BY AccountNumber;[/code]

If the outer query then yes, I agree, an aggregate may be a better option. I am not arguing about the likeness of the results however. All I am pointing out is that there is a difference in implementation, semantical maybe, but there is a penalty for using GROUP BY when it is not needed when compared to using a correlated subquery as Wayne presents it.

If GROUP_CONCAT() existed on SQL Server the XML technique from the article would almost certainly outperform it so while it may seem like a fair comparison from a query results point of view it's really apples and oranges under the hood.

Now, from a syntax point of view, when the outer query must be distinct, I like the GROUP_CONCAT() method which is why I wrote GROUP_CONCAT() for SQL Server by leveraging the SQLCLR but it is not the best choice for all scenarios.


My point is that the solution presented in the article is the equivalent to a solution using GROUP BY, so saying that there's no GROUP BY clause is only accurate when talking about the syntax. The query could be rewritten like this, and it's the same query from the optimizer's perspective:


SELECT AccountNumber,
CommaList = STUFF((
SELECT ',' + Value
FROM #TestData
WHERE AccountNumber = t.AccountNumber
ORDER BY Value
FOR XML PATH(''),
TYPE).value('.','varchar(max)'),1,1,'')
FROM #TestData t
GROUP BY AccountNumber
ORDER BY AccountNumber;



This produces the same query plan as the CTE version, and has the same IO profile, so it is incorrect to assert some penalty for using GROUP BY. I'm not sure how you arrive at the conclusion that the XML version would outperform GROUP_CONCAT if the latter were implemented in MS's product. Is this based on your results from implementing it using CLR?

-TroyK

Yes they are. I am now on the same page as far as which DISTINCT you were referring too. If the dataset you're aggregating on is not unique and you do not need to apply DISTINCT then SQL will not need to sort or de-duplicate the key column to aggregate the results, whereas the GROUP BY always will (unless the optimizer can ignore it due to the presence of a unique index). That is where the penalty I am referring too is introduced.

Here is the equivalent query using my SQLCLR:

SELECT AccountNumber,
dbo.GROUP_CONCAT(Value) AS CommaList
FROM #TestData t
GROUP BY AccountNumber
ORDER BY AccountNumber;



In my testing (test scripts are included in the CodePlex download) the SQLCLR object is on par with the XML TYPE method shown in the article in terms of performance. In some test cases the XML performs best, in others the SQL CLR performs best. The +- is roughly 10% in either direction so I am comfortable saying they are "comparable" for general use.

__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
Neha05
Neha05
Mr or Mrs. 500
Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)

Group: General Forum Members
Points: 524 Visits: 60
Nice article.
Neha05
Neha05
Mr or Mrs. 500
Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)Mr or Mrs. 500 (524 reputation)

Group: General Forum Members
Points: 524 Visits: 60
Nice article!
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)

Group: General Forum Members
Points: 87140 Visits: 41113
glock 71629 (1/12/2011)
This is exactly the situation where I use PowerShell frequently. I am able to export data from whatever complex query. Idea is something like this:

Invoke-Sqlcmd -ServerInstance myServer -Database MyDb -Query "SELECT * FROM SomeTable" | Export-Csv ./result.csv -NoTypeInformation



In -Query can be any T-SQL code and results are exported natively to CSV file. I work as ConfigMgr admin and this is the way how you can really easily receive your data.

David


I know this post is more than 2 years old but curiosity finally got the best of me. "Everyone" keeps recommending that PowerShell be used for these types of things but not once have I seen them identify how you could use PowerShell in a stored procedure. ;-) Also, why even bother with PowerShell when this is so easily done without it and it works in a stored procedure or a function without having to somehow get to the command line to do it?

Yes, if you're trying to create a CSV file, the PowerShell solution works just fine but why would you use it instead of T-SQL if the target was also T-SQL or just a result set to an app?

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Orlando Colamatteo
Orlando Colamatteo
SSCoach
SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)

Group: General Forum Members
Points: 15033 Visits: 14396
...when this is so easily done without it and it works in a stored procedure or a function without having to somehow get to the command line to do it?

How would you do it in T-SQL?

__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)SSC Guru (87K reputation)

Group: General Forum Members
Points: 87140 Visits: 41113
opc.three (5/4/2013)
...when this is so easily done without it and it works in a stored procedure or a function without having to somehow get to the command line to do it?

How would you do it in T-SQL?


If you're just outputing a result set (which is the object of the article... not exporting data), then just like the article does with the [text] kicker. I'm not sure why anyone would even bring up PowerShell in such a situation.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Orlando Colamatteo
Orlando Colamatteo
SSCoach
SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)SSCoach (15K reputation)

Group: General Forum Members
Points: 15033 Visits: 14396
I think Wayne already called out that the comment was misplaced.

http://www.sqlservercentral.com/Forums/FindPost1046450.aspx

I thought you were taking things in a different direction.

__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search