Combine Multiple rows into single column

  • Hi all,

    I have a table which has multiple rows for the same product:

    Prod1, Description

    Prod1, Description

    Prod2, Description

    Prod3, Description

    Prod3, Description

    Prod3, Description

    Prod3, Description

    I need to combine the description for each product into a table that has the product only once plus its description.

    Prod1, Description

    Prod2, Description

    Prod3, Description

    The description in table2 will concatenate all the descriptions that appear in the first table.

    Anybody have a solution to this in TSQL please?

    Thanks

    Stuart

  • This should work:

    ;With cteTable1 as (

    Select Row_Number() OVER(Partition by Prod Order by 1) as Row

    , Count(*) OVER(Partition by Prod) as Cnt

    , Prod

    , Description

    From table1

    ),

    cteIterate as (

    Select Row, Cnt, Prod, Description, Description as AllDescr

    From cteTable1

    Where Row = 1

    UNION ALL

    Select t.Row, t.Cnt, t.Prod, t.Description

    , i.AllDescr + t.Description as AllDescr

    From cteTable1 t

    Join cteIterate i ON t.Row-1 = i.Row

    Where t.Row > 1

    )

    Select Prod, AllDescr

    From cteIterate

    Where Row=Cnt

    I am sure there must be a better way, but I am drawing a blank right now.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Thanks for your time replying!

    I have tried the query but get the following error:

    --------------------------------

    Msg 240, Level 16, State 1, Line 1

    Types don't match between the anchor and the recursive part in column "AllDescr" of recursive query "cteIterate".

    --------------------------------

    The query as it is now looks like this:

    ;With cteTable1 as (

    Select Row_Number() OVER(Partition by NorthamberPartNumber Order by (SELECT 1)) as Row

    , Count(*) OVER(Partition by NorthamberPartNumber) as Cnt

    , NorthamberPartNumber

    , Description

    From xNorthamberTechInfo

    ),

    cteIterate as (

    Select Row, Cnt, NorthamberPartNumber, Description, Description as AllDescr

    From cteTable1

    Where Row = 1

    UNION ALL

    Select t.Row, t.Cnt, t.NorthamberPartNumber, t.Description

    , i.AllDescr + t.Description as AllDescr

    From cteTable1 t

    Join cteIterate i ON t.Row-1 = i.Row

    Where t.Row > 1

    )

    Select NorthamberPartNumber, AllDescr

    From cteIterate

    Where Row=Cnt

    Any ideas?

    Thanks!

  • if object_id('fn_list') is not null drop function fn_list;

    go

    create function sp_list

    (

    @prod varchar(100)

    )

    returns varchar(8000)

    as

    begin

    declare @list varchar(8000)

    select @list = isnull(@list ,'') + case when isnull(@list ,'') <> '' then ' ' else '' end + isnull(Description,'')

    from Table

    where Prod = @prod

    return @list

    end;

    go

    if object_id('New_Table') is not null drop table New_Table;

    go

    select prod, fn_list(prod) [Description]

    into New_Table

    from Table

    group by prod;

    go

    Hope that works. I think a cursur would work for something like that as well.

  • stuart.hill (8/4/2008)


    --------------------------------

    Msg 240, Level 16, State 1, Line 1

    Types don't match between the anchor and the recursive part in column "AllDescr" of recursive query "cteIterate".

    --------------------------------

    There's not much that I can do about that since you haven't given us any table or column definitions.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • There may be a better way (ie without the subquery, set based) but here is something that at least works. If I get any more brilliant ideas, I'll post back. Here is some code to work with for now.

    create table #TestTable (

    ProdID varchar(5),

    ProdDesc varchar(256)

    );

    insert into #TestTable (

    ProdID,

    ProdDesc

    )

    select 'Prod1', 'Description 1' union all

    select 'Prod1', 'Description 2' union all

    select 'Prod2', 'Description 1' union all

    select 'Prod3', 'Description 1' union all

    select 'Prod3', 'Description 2' union all

    select 'Prod3', 'Description 3' union all

    select 'Prod3', 'Description 4';

    select distinct

    a.ProdID,

    stuff((select

    ',' + b.ProdDesc

    from

    #TestTable b

    where

    b.ProdID = a.ProdID

    for xml path('')),1,1,'') as ProdDesc

    from

    #TestTable a;

    drop table #TestTable;

    😎

  • Hi all, and thank you all for your time! I actually thought my SQL was pretty good until I hit this problem. So, some more information as requested. The product descriptions have been sent to me by a supplier in an Excel file. The file has 380,000 rows! Here's the first bit of it...

    "PRODNO01"|"SUPPLIER PART NO"|"COM DTL"

    "0231A320 "|"0231A320 "|"Function/Use : "

    "0231A320 "|"0231A320 "|"Warranty : "

    "0231A320 "|"0231A320 "|"Notes/Features : "

    "0231A320 "|"0231A320 "|" "

    "0231A49M "|"0231A49M "|"Function/Use : Ports: One RJ-11 Analog Modem Port "

    "0231A49M "|"0231A49M "|" "

    "0231A49M "|"0231A49M "|" Speed: Maximum 56Kbps "

    "0231A49M "|"0231A49M "|" "

    "0231A49M "|"0231A49M "|" Standards: ITU-T V.90, V.34 (33.6 "

    "0231A49M "|"0231A49M "|" kbps), V.FC, V.32 bis, V.32, V.22 bis,"

    "0231A49M "|"0231A49M "|" V.22A/B, V.23, V.21, Bell 212A a, Bell"

    "0231A49M "|"0231A49M "|" 103 "

    "0231A49M "|"0231A49M "|" "

    "0231A49M "|"0231A49M "|" Package Contents "

    "0231A49M "|"0231A49M "|" analog modem interface card "

    "0231A49M "|"0231A49M "|" "

    "0231A49M "|"0231A49M "|" "

    "0231A49M "|"0231A49M "|" "

    "0231A49M "|"0231A49M "|"Warranty : "

    "0231A49M "|"0231A49M "|"Notes/Features : "

    "0231A49M "|"0231A49M "|" "

    "0231A50Y "|"0231A50Y "|"Function/Use : Ports: 1 "

    "0231A50Y "|"0231A50Y "|" "

    "0231A50Y "|"0231A50Y "|" Connector: 1 x RJ-11 "

    "0231A50Y "|"0231A50Y "|" "

    "0231A50Y "|"0231A50Y "|" Interface service: ADSL and ADSL2+ "

    "0231A50Y "|"0231A50Y "|" over regular telephone line "

    "0231A50Y "|"0231A50Y "|" "

    "0231A50Y "|"0231A50Y "|" Interface standard: ITU-T 992.1 G.DMT,"

    "0231A50Y "|"0231A50Y "|" ITU-T 992.2 G.Lite, ANSI T1.413 Issue "

    "0231A50Y "|"0231A50Y "|" 2, ITU-T 992.3, ITU-T 992.5 "

    "0231A50Y "|"0231A50Y "|" "

    "0231A50Y "|"0231A50Y "|" Maximum downstream rate: 24 Mbps "

    "0231A50Y "|"0231A50Y "|" "

    "0231A50Y "|"0231A50Y "|" Maximum upstream rate: 1 Mbps "

    "0231A50Y "|"0231A50Y "|" "

    "0231A50Y "|"0231A50Y "|" Package Contents "

    "0231A50Y "|"0231A50Y "|" Interface card "

    "0231A50Y "|"0231A50Y "|"Warranty : "

    "0231A50Y "|"0231A50Y "|"Notes/Features : "

    As you can see it is very well formed(!), and no, they won't send it to me any other way! The supplier column (2nd column) can be ignored.

    I have put together an Integration Services package to get this done automatically.

    As suggested, I could[/i] use a cursor, but that is going to be slow, slow, slow on this many records! What I am trying to achieve (Well, hoping you'll help me achieve!) is a sweet piece of SQL that will do the concatenation in a few seconds/minutes. Barry's script I have now got running, but I stopped it after 30 minutes. (Script appended btw).

    The input columns can be any data type, as first I load a temp table in SSIS. Then I want to concat the description's into another table, so DTs are flexible!

    Does this help more? I've pulled out most of my hair trying to figure a way forward, so all your help so far is appreciated!

    Thanks,

    Stuart

    --------- RBYs script now working but too slow for production -----------

    ;With cteTable1 as (

    Select Row_Number() OVER(Partition by NorthamberPartNumber Order by (SELECT 1)) as Row

    , Count(*) OVER(Partition by NorthamberPartNumber) as Cnt

    , NorthamberPartNumber

    , [Description]

    From xNorthamberTechInfo

    ),

    cteIterate as (

    Select Row, Cnt, NorthamberPartNumber, Description, CAST(Description as nvarchar(4000)) as AllDescr

    From cteTable1

    Where Row = 1

    UNION ALL

    Select t.Row, t.Cnt, t.NorthamberPartNumber, t.Description

    , i.AllDescr + t.Description as AllDescr

    From cteTable1 t

    Join cteIterate i ON t.Row-1 = i.Row

    Where t.Row > 1

    )

    Select NorthamberPartNumber, AllDescr

    From cteIterate

    Where Row=Cnt

    ------------------- And how do I stop those smileys from appearing in the post?

  • The smiley's appear when you have a space followed by a ")" followed by a space or an end-line. I try to put a tab or a character in front of my ")"'s to prevent it.

    Try Lynn's solution first, before I try to fix my solution. (No point if Lynn's works!)

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Thanks for the quick response Barry.

    I don't think I can make Lynn's solution work. The file has 12,500 distinct products.

    Unless I am missing something that would make the SELECT/UNION ALL statements into the temporary table unworkable.

    ?

  • D'oh!

    Ignore last post - I have just read it

  • The fastest way I know to do this is with a "running totals" solution:

    create table #Descriptions (

    ID int identity primary key nonclustered,

    ProdNo varchar(50),

    [Description] varchar(50),

    ConcatDescription varchar(max))

    create clustered index CID_ProdNo on #Descriptions (prodno)

    insert into #Descriptions (ProdNo, [Description])

    select 'Product1', 'Description1' union all

    select 'Product1', 'Description2' union all

    select 'Product2', 'Description3'

    declare @Dscrpt varchar(max), @ProdNo varchar(50)

    update #descriptions

    set @dscrpt = concatdescription =

    case

    when @ProdNo = ProdNo then coalesce(@dscrpt + ';' + [description], [description])

    else [description]

    end,

    @ProdNo = ProdNo

    ;with Concats (ProdNo, ConcatDescription, Row) as

    (select ProdNo, ConcatDescription,

    row_number() over (partition by prodno order by len(concatdescription) desc)

    from #descriptions)

    select *

    from concats

    where row = 1

    I did a speed test on this, with 1-million rows of data:

    set nocount on

    create table #Descriptions (

    ID int identity primary key nonclustered,

    ProdNo varchar(50),

    [Description] varchar(50),

    ConcatDescription varchar(max))

    create clustered index CID_ProdNo on #Descriptions (prodno)

    insert into #Descriptions (ProdNo, [Description])

    select 'Product' + cast(n1.number as varchar(10)), n1.number

    from dbo.Numbers n1

    cross join dbo.Numbers n2

    where n1.number between 1 and 1000

    and n2.number between 1 and 1000

    declare @Dscrpt varchar(max), @ProdNo varchar(50)

    set statistics time on

    update #descriptions

    set @dscrpt = concatdescription =

    case

    when @ProdNo = ProdNo then coalesce(@dscrpt + ';' + [description], [description])

    else [description]

    end,

    @ProdNo = ProdNo

    /*

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 4 ms.

    SQL Server Execution Times:

    CPU time = 136734 ms, elapsed time = 682776 ms.

    */

    ;with Concats (ProdNo, ConcatDescription, Row) as

    (select ProdNo, ConcatDescription,

    row_number() over (partition by prodno order by len(concatdescription) desc)

    from #descriptions)

    delete from concats

    where row > 1

    /*

    SQL Server parse and compile time:

    CPU time = 0 ms, elapsed time = 8 ms.

    SQL Server Execution Times:

    CPU time = 106639 ms, elapsed time = 1168282 ms.

    */

    select top 100 *

    from #descriptions

    Simple-Talk has an article on this subject:

    http://www.simple-talk.com/sql/t-sql-programming/concatenating-row-values-in-transact-sql/

    Per the author, this method is unreliable, but I've not yet seen it fail, nor seen any proof of failure. I think his dislike of it is because it's undocumented. On the same site, Robyn Page and Phil Factor have used this method in their workshops.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Hey...Lynn! you are amazing! your solution helped me to save hours on the problem. Thanks a lot!

  • RBarryYoung (8/4/2008)


    stuart.hill (8/4/2008)


    --------------------------------

    Msg 240, Level 16, State 1, Line 1

    Types don't match between the anchor and the recursive part in column "AllDescr" of recursive query "cteIterate".

    --------------------------------

    There's not much that I can do about that since you haven't given us any table or column definitions.

    You might need to change the dataa types to be the same and another thing that you should also check is if the collation. If the collation of the database is not the same as the one of the serve then use the collate database_default on the column that is mentioned as part of the error message.

    What you don't know won't hurt you but what you know will make you plan to know better
  • Lynn Pettis (8/4/2008)


    There may be a better way (ie without the subquery, set based) but here is something that at least works. If I get any more brilliant ideas, I'll post back. Here is some code to work with for now.

    create table #TestTable (

    ProdID varchar(5),

    ProdDesc varchar(256)

    );

    insert into #TestTable (

    ProdID,

    ProdDesc

    )

    select 'Prod1', 'Description 1' union all

    select 'Prod1', 'Description 2' union all

    select 'Prod2', 'Description 1' union all

    select 'Prod3', 'Description 1' union all

    select 'Prod3', 'Description 2' union all

    select 'Prod3', 'Description 3' union all

    select 'Prod3', 'Description 4';

    select distinct

    a.ProdID,

    stuff((select

    ',' + b.ProdDesc

    from

    #TestTable b

    where

    b.ProdID = a.ProdID

    for xml path('')),1,1,'') as ProdDesc

    from

    #TestTable a;

    drop table #TestTable;

    😎

    Just something to be aware of, if the descriptions contain any special XML characters such as ampersand, the query will escape them. This can be fixed as below.

    create table #TestTable (

    ProdID varchar(5),

    ProdDesc varchar(256)

    );

    insert into #TestTable (

    ProdID,

    ProdDesc

    )

    select 'Prod1', 'Description 1' union all

    select 'Prod1', 'Description 2' union all

    select 'Prod2', 'Description 1' union all

    select 'Prod3', 'Description 1' union all

    select 'Prod3', 'Description 2' union all

    select 'Prod3', 'Description 3' union all

    select 'Prod3', 'Description 4' union all

    select 'Prod3', 'Description 5 & 6';

    select distinct

    a.ProdID,

    stuff((select

    ',' + b.ProdDesc

    from

    #TestTable b

    where

    b.ProdID = a.ProdID

    for xml path(''),type).value('.[1]','varchar(max)'),1,1,'') as ProdDesc

    from

    #TestTable a;

    drop table #TestTable;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • How does this approach work for you:

    --first a temp table and data to test with:

    Create Table #Products (ID int, Description varchar(30))

    INSERT INTO #Products (ID, Description)values (1, 'Hairy')

    INSERT INTO #Products (ID, Description)values (1, 'Fat')

    INSERT INTO #Products (ID, Description)values (1, 'Dog')

    INSERT INTO #Products (ID, Description)values (2, 'Cross-eyed')

    INSERT INTO #Products (ID, Description)values (2, 'Cat')

    INSERT INTO #Products (ID, Description)values (3, 'Elephant')

    INSERT INTO #Products (ID, Description)values (4, 'Scary')

    INSERT INTO #Products (ID, Description)values (4, 'Green')

    INSERT INTO #Products (ID, Description)values (4, 'Poisonous')

    INSERT INTO #Products (ID, Description)values (4, 'Frog')

    --Use the Cross Apply and For XML construct to rotatethe table

    SELECT DISTINCT ID, C.Description

    FROM #Products

    CROSS APPLY

    (

    SELECT [Description] + ' '

    FROM #Products P

    WHERE P.ID=#Products.ID

    FOR XML PATH('')

    ) C(Description)

    DROP TABLE #Products

    You should see this after running:

    ID Description

    ----------- ----------

    1 Hairy Fat Dog

    2 Cross-eyed Cat

    3 Elephant

    4 Scary Green Poisonous Frog

    (4 row(s) affected)

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply