How to concatenate group of rows

Question

Post reply

How to concatenate group of rows

RZ52

Mr or Mrs. 500

Points: 536
More actions
February 27, 2013 at 6:28 pm

#274683

Hi,
I have a data structure as followed : [Name], [Task] and the data is :
John, task1
John, task2
Joe, task1
Joe, task3
Joe, task4
Jane, task1
Jane, task4
..........
I wonder how to use the 'SELECT' to make the following result :
John, task1;task2
Joe, task1;task3;task4
Jane, task1;task4
.....
Thanks for any help.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

Arjun Sivadasan SSCertifiable Points: 5902 More actions · Answer 1

select

Name, stuff((select ';' + Task from Allocation where Name = a.Name for xml path('') order by Task),1,1,'') [Tasks]

from

Allocation a

group by Name

Wrote on notepad, may have syntax error.

https://sqlroadie.com/

RZ52 Mr or Mrs. 500 Points: 536 More actions · Answer 2

Dear Arjun,

Thanks for guidance. With some minor modification, it works perfectly.

Now, I have another issue, in my [Task] field I have subgroups that is indicated with a specific format like :

[Name] , [Task]

------------------

John , task1:a

John, task2:a

John, task3

John, task4:a

........

What I need is when I concatenate my values, I only need to include those tasks that have subgroup a (:a) and omit those without subgroup (:a). so I need :

[Name] , [Task]

------------------

John , task1:a;task2:a;task4:a

................

How could I achieve that ?

Thanks again

Arjun Sivadasan SSCertifiable Points: 5902 More actions · Answer 3

append charindex(':',task) > 0 to the query?

Not sure of the requirement. If you want to filter only subgroup 'a' and not 'b' or 'c' you may want to check out splitter functions. Just search in SSC. Basically, it's a function which you can use like

isRightHalf(task,':') = 'a'

https://sqlroadie.com/

Jeff Moden SSC Guru Points: 1003869 More actions · Answer 4

RZ52 (2/27/2013)
Hi,
I have a data structure as followed : [Name], [Task] and the data is :
John, task1
John, task2
Joe, task1
Joe, task3
Joe, task4
Jane, task1
Jane, task4
..........
I wonder how to use the 'SELECT' to make the following result :
John, task1;task2
Joe, task1;task3;task4
Jane, task1;task4
.....
Thanks for any help.

Since that represents denormalization almost as bad as XML, I have to ask, why do you want to do this to your data? What is the business requirement?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

RZ52 Mr or Mrs. 500 Points: 536 More actions · Answer 5

Dear Jeff,

The above scenario is only a simplified example. My real dataset is about the serotype profile of certain bacterial samples. So, we do some probe tests for certain factors in triplicate manner and the data structure is :

[Sample_ID], [Rep_ID], [Result]

----------------------------

1, 1, O152

1, 2, O2

1, 3, O157

2, 1, O5

2, 2, Negative

2, 3, O1

3, 1, O157

3, 2, O1

3, 3, O1

4, 1, O157

4, 2, Negative

4, 3, O2

5, 1, Negative

5, 2, Negative

5, 3, Negative

...........

Now the I need a profile column in result set with the following conditions :

1) The O15? are the dangerous pathogens, so we only need to profile these factors. Those samples with no O15? factor will be tagged as "Non-pathogen"

2) We don't want to show "Negative" as part of profile except when all three replicates are "Negative" which will be assigned as "Negative"

So the final result needs to be like :

[Sample ID], [Profile]

--------------------------------------------

1, O152;O157

2, Non-pathogen

3, O157

4, O157

5, Negative

..................

I hope I could explain my question.

So, if you have any suggestion or help, I really appreciate

Cadavre SSC-Forever Points: 41690 More actions · Answer 6

So, to make sure I'm clear with what you want.

With this sample data: -

IF object_id('tempdb..#testEnvironment') IS NOT NULL

BEGIN

DROP TABLE #testEnvironment;

END;

SELECT [Sample_ID], [Rep_ID], [Result]

INTO #testEnvironment

FROM (VALUES(1, 1, 'O152'),(1, 2, 'O2'),(1, 3, 'O157'),(2, 1, 'O5'),

(2, 2, 'Negative'),(2, 3, 'O1'),(3, 1, 'O157'),(3, 2, 'O1'),

(3, 3, 'O1'),(4, 1, 'O157'),(4, 2, 'Negative'),(4, 3, 'O2'),

(5, 1, 'Negative'),(5, 2, 'Negative'),(5, 3, 'Negative')

)a([Sample_ID], [Rep_ID], [Result]);

Which produces: -

Sample_ID Rep_ID Result

----------- ----------- --------

1 1 O152

1 2 O2

1 3 O157

2 1 O5

2 2 Negative

2 3 O1

3 1 O157

3 2 O1

3 3 O1

4 1 O157

4 2 Negative

4 3 O2

5 1 Negative

5 2 Negative

5 3 Negative

You'd like a result that looks like this: -

Sample_ID Profile

----------- ----------

1 O152;O157

2 Negative

3 O157

4 O157

5 Negative

??

If I'm correct in my assumptions above, then using the sample data that I defined above you could achieve the result like this: -

SELECT *

FROM (SELECT DISTINCT [Sample_ID]

FROM #testEnvironment

)a

CROSS APPLY (SELECT ISNULL(STUFF((SELECT ';'+[Result]

FROM #testEnvironment

WHERE a.Sample_ID = Sample_ID

AND CHARINDEX('O15',[Result]) > 0

FOR XML PATH(''), TYPE

).value('.','NVARCHAR(MAX)'),1,1,'')

,'Negative')

)b([Profile]);

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

RZ52 Mr or Mrs. 500 Points: 536 More actions · Answer 7

Dear Dadavre,

The proposed solution works good. However, we need also to distinguish those samples that are positive but no O15? factor was detected and tag them as "Non-pathogen" (in my dataset "Sample 2") and current command return it also as "Negative". How can I overcome this issue ?

Thanks again for your help

Cadavre SSC-Forever Points: 41690 More actions · Answer 8

RZ52 (2/28/2013)
Dear Dadavre,
The proposed solution works good. However, we need also to distinguish those samples that are positive but no O15? factor was detected and tag them as "Non-pathogen" (in my dataset "Sample 2") and current command return it also as "Negative". How can I overcome this issue ?
Thanks again for your help

~~Sorry, I missed "Non-pathogen". Does "Non-pathogen" only come up if all of the results are "Negative" ?~~

~~If so: -~~

~~SELECT [Sample_ID], ISNULL([Interim],[Profile]) AS [Profile]~~

~~FROM (SELECT [Sample_ID],~~

~~CASE WHEN MAX([Result]) = MIN([Result]) AND MAX([Result]) = 'Negative' THEN 'Non-pathogen' ELSE NULL END~~

~~FROM #testEnvironment~~

~~GROUP BY [Sample_ID]~~

~~)a([Sample_ID],[Interim])~~

~~CROSS APPLY (SELECT ISNULL(STUFF((SELECT ';'+[Result]~~

~~FROM #testEnvironment~~

~~WHERE a.Sample_ID = Sample_ID~~

~~AND CHARINDEX('O15',[Result]) > 0~~

~~FOR XML PATH(''), TYPE~~

~~).value('.','NVARCHAR(MAX)'),1,1,'')~~

~~,'Negative')~~

~~)b([Profile]);~~

~~Produces: -~~

~~Sample_ID Profile~~

~~----------- ------------~~

~~1 O152;O157~~

~~2 Negative~~

~~3 O157~~

~~4 O157~~

~~5 Non-pathogen~~

Re-read what you wrote, I got it the wrong way around.

Try this: -

SELECT [Sample_ID], ISNULL([Interim],[Profile]) AS [Profile]

FROM (SELECT [Sample_ID],

CAST(CASE WHEN MAX([Result]) = MIN([Result]) AND MAX([Result]) = 'Negative'

THEN 'Negative'

ELSE NULL END AS NVARCHAR(MAX))

FROM #testEnvironment

GROUP BY [Sample_ID]

)a([Sample_ID],[Interim])

CROSS APPLY (SELECT ISNULL(STUFF((SELECT ';'+[Result]

FROM #testEnvironment

WHERE a.Sample_ID = Sample_ID

AND CHARINDEX('O15',[Result]) > 0

FOR XML PATH(''), TYPE

).value('.','NVARCHAR(MAX)'),1,1,'')

,'Non-pathogen')

)b([Profile]);

Produces: -

Sample_ID Profile

----------- --------------

1 O152;O157

2 Non-pathogen

3 O157

4 O157

5 Negative

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Jeff Moden SSC Guru Points: 1003869 More actions · Answer 9

And now we know why I asked. 🙂 There's always more behind these types of requests. Thanks you very much for taking the time to post the business reason (filtered string aggregation to make human review/usage simpler is how I'm looking at it).

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

RZ52 Mr or Mrs. 500 Points: 536 More actions · Answer 10

Dear Jeff,

Thanks for your advice. I am newbie so it might take time to learn how I need to present my situation but I always appreciate suggestions from seniors.

Best Regards

RZ52 Mr or Mrs. 500 Points: 536 More actions · Answer 11

Dear Cadavre,

It worked nice and perfect.

Sorry if I was not very clear on explaining situation.

Best Regards,

RZ52 Mr or Mrs. 500 Points: 536 More actions · Answer 12

Cadavre (2/28/2013)
So, to make sure I'm clear with what you want.
With this sample data: -
IF object_id('tempdb..#testEnvironment') IS NOT NULL
BEGIN
DROP TABLE #testEnvironment;
END;
SELECT [Sample_ID], [Rep_ID], [Result]
INTO #testEnvironment
FROM (VALUES(1, 1, 'O152'),(1, 2, 'O2'),(1, 3, 'O157'),(2, 1, 'O5'),
(2, 2, 'Negative'),(2, 3, 'O1'),(3, 1, 'O157'),(3, 2, 'O1'),
(3, 3, 'O1'),(4, 1, 'O157'),(4, 2, 'Negative'),(4, 3, 'O2'),
(5, 1, 'Negative'),(5, 2, 'Negative'),(5, 3, 'Negative')
)a([Sample_ID], [Rep_ID], [Result]);
Which produces: -
Sample_ID Rep_ID Result
----------- ----------- --------
1 1 O152
1 2 O2
1 3 O157
2 1 O5
2 2 Negative
2 3 O1
3 1 O157
3 2 O1
3 3 O1
4 1 O157
4 2 Negative
4 3 O2
5 1 Negative
5 2 Negative
5 3 Negative
You'd like a result that looks like this: -
Sample_ID Profile
----------- ----------
1 O152;O157
2 Negative
3 O157
4 O157
5 Negative
??
If I'm correct in my assumptions above, then using the sample data that I defined above you could achieve the result like this: -
SELECT *
FROM (SELECT DISTINCT [Sample_ID]
FROM #testEnvironment
)a
CROSS APPLY (SELECT ISNULL(STUFF((SELECT ';'+[Result]
FROM #testEnvironment
WHERE a.Sample_ID = Sample_ID
AND CHARINDEX('O15',[Result]) > 0
FOR XML PATH(''), TYPE
).value('.','NVARCHAR(MAX)'),1,1,'')
,'Negative')
)b([Profile]);

Dear Cadavre,

I am testing your solution and something came up that I can't solve. If I have two same result for a sample like :

IF object_id('tempdb..#testEnvironment') IS NOT NULL

BEGIN

DROP TABLE #testEnvironment;

END;

SELECT [Sample_ID], [Rep_ID], [Result]

INTO #testEnvironment

FROM (VALUES(1, 1, 'O152'),(1, 2, 'O2'),(1, 3, 'O157'),(1, 4, 'O157'),(2, 1, 'O5'),

(2, 2, 'Negative'),(2, 3, 'O1'),(3, 1, 'O157'),(3, 2, 'O1'),

(3, 3, 'O1'),(4, 1, 'O157'),(4, 2, 'Negative'),(4, 3, 'O2'),

(5, 1, 'Negative'),(5, 2, 'Negative'),(5, 3, 'Negative')

)a([Sample_ID], [Rep_ID], [Result]);

running your proposed code :

SELECT *

FROM (SELECT DISTINCT [Sample_ID]

FROM #testEnvironment

)a

CROSS APPLY (SELECT ISNULL(STUFF((SELECT ';'+[Result]

FROM #testEnvironment

WHERE a.Sample_ID = Sample_ID

AND CHARINDEX('O15',[Result]) > 0

FOR XML PATH(''), TYPE

).value('.','NVARCHAR(MAX)'),1,1,'')

,'Negative')

)b([Profile]);

will return :

Sample_ID Profile

----------- ----------

1 O152;O157;O157

2 Negative

3 O157

4 O157

5 Negative

How can I avoid the repeating item and have :

Sample_ID Profile

----------- ----------

1 O152;O157

2 Negative

3 O157

4 O157

5 Negative

Thanks in advance for help