How to flatten rows when in unknown for a PIVOT

  • Hi,

    I'm have a table with key and then an address, the key is listed multiple times, so for an AddressKey I can have 1 to 8 addresses. I'd like to return a structure I can join to that has one row for each AddressKey and then Address1, Address2, Address3, ... , Address8 Since the addresses are always different a PIVOT doesn't seem to be the answer, but then again maybe it is? Here is a script to create the test table:

    DROP TABLE [dbo].[Address];

    GO

    CREATE TABLE [dbo].[Address](

    [Address_pk] INT IDENTITY(1,1) NOT NULL,

    [AddressKey] INT NOT NULL,

    [Address] VARCHAR (200) NOT NULL,

    CONSTRAINT [PK_Address] PRIMARY KEY CLUSTERED

    (

    [Address_pk] ASC

    )

    );

    GO

    TRUNCATE TABLE [dbo].[Address];

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100368, '1756 2nd Avenue');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100368, '1758 2nd Avenue');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100368, '1760 2nd Avenue');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100378, '220 East 41st Street');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100378, '222 East 41st Street');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100388, '777-785 6th Avenue');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100388, '787-791 6th Avenue');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100390, '1470 2nd Avenue');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100390, '1466 2nd Avenue');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100390, '1468 2nd Avenue');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100395, '565 10th Avenue');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100395, '557 10th Avenue');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100460, '1201-1255 Kettner Boulevard');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (100460, '1200-1272 India Street');

    INSERT INTO [dbo].[Address] ([AddressKey], [Address]) VALUES (1147720, '550 Lenox Avenue');

    SELECT * FROM [dbo].[Address];

  • First, thank you for posting truly readily-consumable test data. You sure made it easy for me to concentrate on the answer.

    Here's the code to solve your problem. CROSSTABs are usually faster than PIVOTs and a bit easier to remember (for me, anyway).

    WITH cteEnumerate AS

    ( --=== Number each Address for each AddressKey

    SELECT [AddressKey]

    ,[Address]

    ,N = ROW_NUMBER() OVER (PARTITION BY [AddressKey] ORDER BY [Address])

    FROM [dbo].[Address]

    ) --=== Use a high performance CROSSTAB (ancient "Black Arts") to pivot the data.

    SELECT [AddressKey]

    ,[Address1] = MAX(CASE WHEN N = 1 THEN [Address] ELSE '' END)

    ,[Address2] = MAX(CASE WHEN N = 2 THEN [Address] ELSE '' END)

    ,[Address3] = MAX(CASE WHEN N = 3 THEN [Address] ELSE '' END)

    ,[Address4] = MAX(CASE WHEN N = 4 THEN [Address] ELSE '' END)

    ,[Address5] = MAX(CASE WHEN N = 5 THEN [Address] ELSE '' END)

    ,[Address6] = MAX(CASE WHEN N = 6 THEN [Address] ELSE '' END)

    ,[Address7] = MAX(CASE WHEN N = 7 THEN [Address] ELSE '' END)

    ,[Address8] = MAX(CASE WHEN N = 8 THEN [Address] ELSE '' END)

    FROM cteEnumerate

    GROUP BY [AddressKey]

    ORDER BY [AddressKey]

    ;

    For more information on how CROSSTABs work, please see the following two articles. They're a bit old but still spot on. The first article also has a performance test. The newer machines will make the performance differences between such CROSSTABs and PIVOTs a fair bit smaller but none of that will change what I consider to be other serious advantages of CROSSTABs over PIVOTs.

    [font="Arial Black"]Cross Tabs and Pivots, Part 1 – Converting Rows to Columns[/font][/url]

    [font="Arial Black"]Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs[/font][/url]

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ah... if you want a bit of a bonus column that identifies how many addresses there are for each AddressKey, we just need to add another aggregation.

    WITH cteEnumerate AS

    ( --=== Number each Address for each AddressKey

    SELECT [AddressKey]

    ,[Address]

    ,N = ROW_NUMBER() OVER (PARTITION BY [AddressKey] ORDER BY [Address])

    FROM [dbo].[Address]

    ) --=== Use a high performance CROSSTAB (ancient "Black Arts") to pivot the data.

    SELECT [AddressKey]

    ,[Address1] = MAX(CASE WHEN N = 1 THEN [Address] ELSE '' END)

    ,[Address2] = MAX(CASE WHEN N = 2 THEN [Address] ELSE '' END)

    ,[Address3] = MAX(CASE WHEN N = 3 THEN [Address] ELSE '' END)

    ,[Address4] = MAX(CASE WHEN N = 4 THEN [Address] ELSE '' END)

    ,[Address5] = MAX(CASE WHEN N = 5 THEN [Address] ELSE '' END)

    ,[Address6] = MAX(CASE WHEN N = 6 THEN [Address] ELSE '' END)

    ,[Address7] = MAX(CASE WHEN N = 7 THEN [Address] ELSE '' END)

    ,[Address8] = MAX(CASE WHEN N = 8 THEN [Address] ELSE '' END)

    ,[Number of Addresses] = COUNT(*) --<----- Added this

    FROM cteEnumerate

    GROUP BY [AddressKey]

    ORDER BY [AddressKey]

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • That was exactly what I was looking for, and bonus for the explanation, good reading material for this weekend!

  • Great. Thanks for the feedback, Jeff. I really appreciate it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply