Comma separated list

Question

Comma separated list

PSB

SSChampion

Points: 13195
More actions
January 22, 2017 at 9:18 pm

#334275

Hi,
I need a comma separated list .
CREATE TABLE #CommaSeparated
(
ID INT,
Names VARCHAR(3000)
)
INSERT INTO #CommaSeparated ( ID,Names)
SELECT 1,'John Doe' UNION
SELECT 1,' Elizabeth Jackson' UNION
SELECT 1, 'Natalie Reads' UNION
SELECT 2, 'Henny B.' UNION
SELECT 2,'Koyal B.'
SELECT * FROM #CommaSeparated
SELECT DISTINCT
ID,
stuff((
SELECT ',' + t.[Names]
FROM #CommaSeparated t
WHERE t.ID = t1.ID
ORDER BY t.[Names]
FOR xml path('')
),1,1,'') as SharedList
FROM #CommaSeparated t1
GROUP BY ID
SELECT * FROM #CommaSeparated
--Desire list
ID SharedList
1 Elizabeth Brooks,John Doe,Natalie Reads
2 Henny B.,Koyal B.
However in the original query I am getting duplicates like
1 Elizabeth Brooks,Elizabeth Brooks,John Doe,John Doe,Natalie Reads,Natalie Reads
Thanks,
PSB

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

DesNorton SSC-Insane Points: 24285 More actions · Answer 1

PSB - Sunday, January 22, 2017 9:18 PM
Hi,
I need a comma separated list .
CREATE TABLE #CommaSeparated
(
ID INT,
Names VARCHAR(3000)
)
INSERT INTO #CommaSeparated ( ID,Names)
SELECT 1,'John Doe' UNION
SELECT 1,' Elizabeth Jackson' UNION
SELECT 1, 'Natalie Reads' UNION
SELECT 2, 'Henny B.' UNION
SELECT 2,'Koyal B.'
SELECT * FROM #CommaSeparated
SELECT DISTINCT
ID,
stuff((
SELECT ',' + t.[Names]
FROM #CommaSeparated t
WHERE t.ID = t1.ID
ORDER BY t.[Names]
FOR xml path('')
),1,1,'') as SharedList
FROM #CommaSeparated t1
GROUP BY ID
SELECT * FROM #CommaSeparated
--Desire list
ID SharedList
1 Elizabeth Brooks,John Doe,Natalie Reads
2 Henny B.,Koyal B.
However in the original query I am getting duplicates like
1 Elizabeth Brooks,Elizabeth Brooks,John Doe,John Doe,Natalie Reads,Natalie Reads
Thanks,
PSB

In your code, you have a GROUP BY, which means that you don't need the DISTINCT.
However, your data appears to have duplicates - This can be simulated by executing this part of your sample code twice
INSERT INTO #CommaSeparated ( ID,Names) SELECT 1, 'John Doe' UNION SELECT 1,' Elizabeth Jackson' UNION SELECT 1, 'Natalie Reads' UNION SELECT 2, 'Henny B.' UNION SELECT 2, 'Koyal B.';

Now, you need to de-duplicate your data before creating the CSV string.
SELECT ID, STUFF(( SELECT ',' + t.[Names] FROM (SELECT DISTINCT t2.[Names] FROM #CommaSeparated t2 WHERE t2.ID = t1.ID ) AS t ORDER BY t.[Names] FOR xml path('') ),1,1,'') as SharedList FROM #CommaSeparated t1 GROUP BY ID;

How to post data/code on a forum to get the best help.

Eirikur Eiriksson SSC Guru Points: 182882 More actions · Answer 2

Just a quick thought, you should add the TYPE directive and the value method if there are any characters within the concatenated text that would be encoded in the FOR XML statement such as & < > etc.
😎

SELECT ID, STUFF(( SELECT ',' + t.[Names] FROM (SELECT DISTINCT t2.[Names] FROM #CommaSeparated t2 WHERE t2.ID = t1.ID ) AS t ORDER BY t.[Names] FOR xml path(''),TYPE ).value('(./text())[1]','VARCHAR(MAX)'),1,1,'') as SharedList FROM #CommaSeparated t1 GROUP BY ID;

DesNorton SSC-Insane Points: 24285 More actions · Answer 3

Eirikur Eiriksson - Monday, January 23, 2017 2:43 AM
Just a quick thought, you should add the TYPE directive and the value method if there are any characters within the concatenated text that would be encoded in the FOR XML statement such as & < > etc.
😎
SELECT ID, STUFF(( SELECT ',' + t.[Names] FROM (SELECT DISTINCT t2.[Names] FROM #CommaSeparated t2 WHERE t2.ID = t1.ID ) AS t ORDER BY t.[Names] FOR xml path(''),TYPE ).value('(./text())[1]','VARCHAR(MAX)'),1,1,'') as SharedList FROM #CommaSeparated t1 GROUP BY ID;

Thank you Eirikur. Nice tip.

How to post data/code on a forum to get the best help.

PSB SSChampion Points: 13195 More actions · Answer 4

PSB

SSChampion

Points: 13195

January 23, 2017 at 6:53 am

#1924389

Thanks All.

Sergiy SSC Guru Points: 110207 More actions · Answer 5

Try not to use DISTINCT.
Ever.
Just forget this keyword exists

As for this query - keep in mind, that "FOR XML" query is a separate correlated subquery, grouping applied to the external query does not apply to it.
it needs its own "GROUP BY", if you want unique values.

SELECT ID, stuff(( SELECT ',' + t.[Names] FROM #CommaSeparated t WHERE t.ID = t1.ID GROUP BY t.[Names] ORDER BY t.[Names] FOR xml path(''),TYPE ).value('.','VARCHAR(MAX)'),1,1,'') as SharedList FROM #CommaSeparated t1 GROUP BY ID

_____________
Code for TallyGenerator

ScottPletcher SSC Guru Points: 100952 More actions · Answer 6

Sergiy - Wednesday, January 25, 2017 3:32 PM

Try not to use DISTINCT.
Ever.
Just forget this keyword exists

What a bizarre claim. DISTINCT is more efficient than GROUP BY in given situations, so you absolutely shouldn't forget it.

SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Sergiy SSC Guru Points: 110207 More actions · Answer 7

I wonder - what would be those "given situations"?

_____________
Code for TallyGenerator

Alan Burstein SSC Guru Points: 61136 More actions · Answer 8

Sergiy - Wednesday, January 25, 2017 3:32 PM
Try not to use DISTINCT.
Ever.
Just forget this keyword exists

As for this query - keep in mind, that "FOR XML" query is a separate correlated subquery, grouping applied to the external query does not apply to it.
it needs its own "GROUP BY", if you want unique values.
SELECT ID, stuff(( SELECT ',' + t.[Names] FROM #CommaSeparated t WHERE t.ID = t1.ID GROUP BY t.[Names] ORDER BY t.[Names] FOR xml path(''),TYPE ).value('.','VARCHAR(MAX)'),1,1,'') as SharedList FROM #CommaSeparated t1 GROUP BY ID

What's wrong with DISTINCT?

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Sergiy SSC Guru Points: 110207 More actions · Answer 9

Alan.B - Wednesday, January 25, 2017 9:14 PM
Sergiy - Wednesday, January 25, 2017 3:32 PM
Try not to use DISTINCT.
Ever.
Just forget this keyword exists

As for this query - keep in mind, that "FOR XML" query is a separate correlated subquery, grouping applied to the external query does not apply to it.
it needs its own "GROUP BY", if you want unique values.
SELECT ID, stuff(( SELECT ',' + t.[Names] FROM #CommaSeparated t WHERE t.ID = t1.ID GROUP BY t.[Names] ORDER BY t.[Names] FOR xml path(''),TYPE ).value('.','VARCHAR(MAX)'),1,1,'') as SharedList FROM #CommaSeparated t1 GROUP BY ID
What's wrong with DISTINCT?

It works with the set generated by SELECT.

2 queries doing the same thing:

SET STATISTICS IO ON

SELECT DISTINCT ID, stuff(( SELECT ',' + t.[Names] FROM #CommaSeparated t WHERE t.ID = t1.ID GROUP BY t.[Names] ORDER BY t.[Names] FOR xml path('') ),1,1,'') as SharedList FROM #CommaSeparated t1

SELECT ID, stuff(( SELECT ',' + t.[Names] FROM #CommaSeparated t WHERE t.ID = t1.ID GROUP BY t.[Names] ORDER BY t.[Names] FOR xml path('') ),1,1,'') as SharedList FROM #CommaSeparated t1 GROUP BY ID

SET STATISTICS IO OFF

Here is the difference:
Table 'Worktable'. Scan count 9, logical reads 23, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table '#CommaSeparated_____________________________________________________________________________________________________000000000415'. Scan count 3, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table '#CommaSeparated_____________________________________________________________________________________________________000000000415'. Scan count 3, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

_____________
Code for TallyGenerator

Jeff Moden SSC Guru Points: 1003894 More actions · Answer 10

Alan.B - Wednesday, January 25, 2017 9:14 PM
Sergiy - Wednesday, January 25, 2017 3:32 PM
Try not to use DISTINCT.
Ever.
Just forget this keyword exists

As for this query - keep in mind, that "FOR XML" query is a separate correlated subquery, grouping applied to the external query does not apply to it.
it needs its own "GROUP BY", if you want unique values.
SELECT ID, stuff(( SELECT ',' + t.[Names] FROM #CommaSeparated t WHERE t.ID = t1.ID GROUP BY t.[Names] ORDER BY t.[Names] FOR xml path(''),TYPE ).value('.','VARCHAR(MAX)'),1,1,'') as SharedList FROM #CommaSeparated t1 GROUP BY ID
What's wrong with DISTINCT?

There's nothing wrong with DISTINCT. 😉 There's a lot wrong with the way people use it because they don't think about the differences between DISTINCT and GROUP BY. See the following article for a very old classic example... http://www.sqlservercentral.com/articles/Test+Data/61572/

Sergiy did hit it out of the park with his example above. The article explains the difference.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)