Transpose/Pivot textual data

  • In our contract management system, each contract has over 100 reference fields attached to it. These are all stored in single table with contract ID, reference GUID and value as the columns.

    So you will have multiple rows for each contract....one for each of the reference fields and then the value attached to that reference.

    I want to return the data so there is one row per contract with the reference fields as columns and the reference field values as the column data.

    Can this be done using PIVOT as I have tried but not had any success?

    I will post an example of the data later on as I am currently on the train!

    Thanks in advance!

    Richard

  • Yes, it can be done using MAX() and probably a ROW_NUMBER() to prevent aggregation.

    If you post the sample data, I can show you how to do it.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Hi, thanks for the quick response.

    Here is the code to create the sample table.....

    --===== If the test table already exists, drop it

    IF OBJECT_ID('TempDB..#ctr_ref','U') IS NOT NULL

    DROP TABLE #ctr_ref

    --===== Create the test table with

    CREATE TABLE #ctr_ref

    (

    ContractIDINT, --Is a GUID column on real table

    ReferenceNVARCHAR(30), --Is a GUID on source table but will join to the References tbl to get label

    ValueNVARCHAR(100)

    )

    --===== Insert the test data into the test table

    INSERT INTO #ctr_ref

    Values

    ('1','Entity','123'),

    ('1','Analyst','John'),

    ('1','Location','UK'),

    ('1','ContractLength','12'),

    ('2','Entity','456'),

    ('2','Analyst','Mary'),

    ('2','Location','FR'),

    ('2','ContractLength','6'),

    ('3','Entity','123'),

    ('3','Analyst','John'),

    ('3','ContractLength','6')

    This is how I would like the data to be extracted:

    ContractIDEntityAnalystLocationContractLength

    =============================================

    1123JohnUK12

    2456MaryFR6

    3123JohnNULL6

    As shown above, the results should put a NULL in a row where the source data does not have a record for a specific reference.

  • As I said, using MAX() will help you with this problem. I hope that you can get rid of this EAV design which makes everything more complicated.

    Here's an example of the code needed.

    SELECT ContractID,

    MAX(CASE WHEN Reference = 'Entity' THEN Value END) Entity,

    MAX(CASE WHEN Reference = 'Analyst' THEN Value END) Analyst,

    MAX(CASE WHEN Reference = 'Location' THEN Value END) Location,

    MAX(CASE WHEN Reference = 'ContractLength' THEN Value END) ContractLength

    FROM #ctr_ref

    GROUP BY ContractID;

    And here's an article about the solution: http://www.sqlservercentral.com/articles/T-SQL/63681/

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thanks Luis - that's very much appreciated.

    I will use Excel to generate the case statements as I have about 125 references!

    Unfortunately my hands are tied with the structure of the table.....it's a 3rd party application so I just have to work with what I've got!

  • You can use dynamic code to generate the statements.

    DECLARE @SQL varchar(MAX) ;

    SET @SQL = (SELECT DISTINCT ',MAX(CASE WHEN Reference = ''' + Reference + ''' THEN Value END) ' + Reference

    FROM #ctr_ref

    FOR XML PATH(''));

    --PRINT @SQL

    EXEC( 'SELECT ContractID ' + @SQL + ' FROM #ctr_ref GROUP BY ContractID');

    As you have many possible values, you might have some problems which can be solved by splitting the string in 8000 chars long strings.

    Another option is to use the PIVOT operator.

    --Static example

    SELECT ContractID,

    [Entity],

    [Analyst],

    [Location],

    [ContractLength]

    FROM (SELECT ContractID, Reference, Value FROM #ctr_ref) t

    PIVOT (MAX(Value) FOR Reference IN ([Entity],[Analyst],[Location],[ContractLength])) pvt;

    --Dynamic example

    DECLARE @Columns varchar(MAX);

    SET @Columns = STUFF((SELECT DISTINCT ',' + QUOTENAME(Reference)

    FROM #ctr_ref

    FOR XML PATH('')), 1, 1, '');

    --PRINT @Columns

    EXEC( 'SELECT ContractID ' + @Columns + ' FROM (SELECT ContractID, Reference, Value FROM #ctr_ref) t

    PIVOT (MAX(Value) FOR Reference IN (' + @Columns + ')) pvt;');

    For the dynamic code visit this article: http://www.sqlservercentral.com/articles/Crosstab/65048/

    For the concatenation code explanation visit this article: http://www.sqlservercentral.com/articles/comma+separated+list/71700/

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Apologies for the delay in replying Luis.

    This is a great tip for the dynamic SQL. Thanks for taking the time to respond to my query.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply