Transposing repeating sets of rows into colums

  • I have loaded a text file into one colum:

    Table: tmpNames

    ===================

    [firstname = "John"]

    [surname = "Scott"]

    [age = "12"]

    [firstname = "Barbara"]

    [surname = "White"]

    [age = "15"]

    ..................

    [firstname = "Alex"]

    [surname = "Ferguson"]

    [age = "19"]

    Any tips and tricks on how I can transpose it a table with three columns, so I get

    Table: myPeople

    ==============

    firstname surname age

    -----------------------------

    John Scott 12

    Barbara White 15

    Alex Ferguson 19

    Any help will be much appreciated!

  • Ok, here's what I came up with. I'm not sure if it'll help but it might be at least something to start with.

    CREATE TABLE #temp

    (id INT IDENTITY(1,1),

    col VARCHAR(60)

    )

    INSERT INTO #temp

    SELECT 'Table: tmpNames' UNION ALL

    SELECT '===================' UNION ALL

    SELECT '[firstname = "John"]' UNION ALL

    SELECT '[surname = "Scott"]' UNION ALL

    SELECT '[age = "12"]' UNION ALL

    SELECT '[firstname = "Barbara"]' UNION ALL

    SELECT '[surname = "White"]' UNION ALL

    SELECT '[age = "15"]' UNION ALL

    SELECT '..................' UNION ALL

    SELECT '[firstname = "Alex"]' UNION ALL

    SELECT '[surname = "Ferguson"]' UNION ALL

    SELECT '[age = "19"]'

    ;WITH cte AS -- get the row number for each first name as a group separator

    (

    SELECT ROW_NUMBER() OVER(ORDER BY id) AS ROW,

    *

    FROM #temp

    WHERE col LIKE '%firstname%'

    ),

    cte2 AS -- find the ranges per group

    (

    SELECT

    cte1.row,

    cte1.id AS BeginSec,

    CASE WHEN cte2.id IS NULL THEN (SELECT MAX(id) FROM #temp)+1 ELSE cte2.id END AS EndSec

    FROM cte cte1

    LEFT OUTER JOIN cte cte2

    ON cte1.row = cte2.row-1

    ),

    cte3 AS -- eliminate brackets and separate the value per column excluding quotation marks

    (

    SELECT cte2.row,REPLACE(REPLACE(col,'[',''),']','') AS sub,

    REPLACE(SUBSTRING(col,CHARINDEX('=',col)+1,LEN(col)-(CHARINDEX('=',col)+1)),'"','') AS val

    FROM cte2

    INNER JOIN #temp

    ON #temp.id>=BeginSec

    AND #temp.id<EndSec

    WHERE col LIKE '/[%' ESCAPE '/'

    )

    -- final output using CrossTabs

    SELECT

    MAX(CASE WHEN sub LIKE 'firstname%' THEN val ELSE '' END) AS firstname,

    MAX(CASE WHEN sub LIKE 'surname%' THEN val ELSE '' END) AS surname,

    MAX(CASE WHEN sub LIKE 'age%' THEN val ELSE '' END) AS age

    FROM cte3

    GROUP BY ROW

    /* result set

    firstnamesurnameage

    John Scott 12

    Barbara White 15

    Alex Ferguson 19

    */



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Thanks, Looks like it really does the job.

  • wiernyjacek (3/13/2010)


    I have loaded a text file into one colum:

    Table: tmpNames

    ===================

    [firstname = "John"]

    [surname = "Scott"]

    [age = "12"]

    [firstname = "Barbara"]

    [surname = "White"]

    [age = "15"]

    ..................

    [firstname = "Alex"]

    [surname = "Ferguson"]

    [age = "19"]

    Any tips and tricks on how I can transpose it a table with three columns, so I get

    Table: myPeople

    ==============

    firstname surname age

    -----------------------------

    John Scott 12

    Barbara White 15

    Alex Ferguson 19

    Any help will be much appreciated!

    Ummm... what do the first 4 lines of the text file actually look like? I ask because I believe you may be making it hard on yourself. Please remember not to post any private information.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • These are fake names, so no private information is sent here. However I analysed the file again and it looks like the age is not always given, so the file may look like this.

    I have loaded a text file into one colum:

    [firstname = "John"]

    [surname = "Scott"]

    [age = "12"]

    [firstname = "Barbara"]

    [surname = "White"]

    [firstname = "Alex"]

    [surname = "Ferguson"]

    [age = "19"]

    In such case the age should be inserted as NULL.

  • wiernyjacek (3/15/2010)


    These are fake names, so no private information is sent here. However I analysed the file again and it looks like the age is not always given, so the file may look like this.

    I have loaded a text file into one colum:

    [firstname = "John"]

    [surname = "Scott"]

    [age = "12"]

    [firstname = "Barbara"]

    [surname = "White"]

    [firstname = "Alex"]

    [surname = "Ferguson"]

    [age = "19"]

    In such case the age should be inserted as NULL.

    That doesn't help me help you... I think you may be loading the text file incorrectly. Please post the first 4 lines of the file so I can help here.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Out of interest, I have come up with another variation which produces the same result as Lutz

    IF NOT OBJECT_ID ('tempdb.dbo.#FOO', 'U') IS NULL

    DROP TABLE #FOO

    SELECT 'Table: tmpNames' AS BulkColumn INTO #FOO UNION ALL

    SELECT '===================' UNION ALL

    SELECT '[firstname = "John"]' UNION ALL

    SELECT '[surname = "Scott"]' UNION ALL

    SELECT '[age = "12"]' UNION ALL

    SELECT '[firstname = "Barbara"]' UNION ALL

    SELECT '[surname = "White"]' UNION ALL

    SELECT '[age = "15"]' UNION ALL

    SELECT '..................' UNION ALL

    SELECT '[firstname = "Alex"]' UNION ALL

    SELECT '[surname = "Ferguson"]' UNION ALL

    SELECT '[age = "19"]'

    ;

    WITH cteTally

    AS

    (

    SELECT TOP 4000 ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS pk

    FROM master.sys.All_Columns

    )

    ,

    cte2 AS

    (

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS ROW,

    * FROM #FOO

    CROSS APPLY

    (

    SELECT

    REPLACE([1], '[', '') AS COL1, REPLACE(REPLACE([2], '"', ''), ']', '') AS COL2

    FROM

    (

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS ROW,

    NULLIF(SUBSTRING(BulkColumn + '=', pk, CHARINDEX('=', BulkColumn + '=', pk) - pk), '') AS Value

    FROM cteTally

    WHERE pk < LEN(BulkColumn) + 2 AND SUBSTRING('=' + BulkColumn + '=', pk, 1) = '='

    ) AS Z

    PIVOT

    (

    MAX(Value) FOR ROW IN

    (

    [1],[2]

    )

    )

    AS pvt

    )

    AS Y

    WHERE LEFT(BulkColumn, 4) IN ('[fir', '[sur', '[age')

    )

    ,

    cte3 AS

    (

    SELECT

    ROW - ((ROW - 1) % 3) AS block,

    * FROM cte2

    )

    SELECT firstname, surname, age

    FROM

    (

    SELECT block, COL1, COL2

    FROM cte3

    ) AS pvt

    PIVOT

    (

    MAX(COL2) FOR COL1 IN (firstname, surname, age)

    ) AS Y

  • The problem is that there's nothing in the bulk table to guarantee the order and I don't trust an unindexed heap to do so.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • OK then, say we don't even risk putting the file into a table and run the CTE directly on the file using OPENROWSET BULK. Then in cte3 we can specifically set the ORDER BY clause. Seems to work OK.

    WITH cteFOO

    AS

    (

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS ROW,

    BulkColumn FROM OPENROWSET (BULK 'C:\Transpose.txt', FORMATFILE = 'C:\BulkFile.fmt') AS Z

    WHERE LEFT(BulkColumn, 4) IN ('[fir', '[sur', '[age')

    )

    ,

    cteTally

    AS

    (

    SELECT TOP 4000 ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS pk

    FROM master.sys.All_Columns

    )

    ,

    cte2 AS

    (

    SELECT

    ROW, BulkColumn, COL1, COL2

    FROM cteFOO

    CROSS APPLY

    (

    SELECT

    REPLACE([1], '[', '') AS COL1, REPLACE(REPLACE([2], '"', ''), ']', '') AS COL2

    FROM

    (

    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS ROW,

    NULLIF(SUBSTRING(BulkColumn + '=', pk, CHARINDEX('=', BulkColumn + '=', pk) - pk), '') AS Value

    FROM cteTally

    WHERE pk < LEN(BulkColumn) + 2 AND SUBSTRING('=' + BulkColumn + '=', pk, 1) = '='

    ) AS Z

    PIVOT

    (

    MAX(Value) FOR ROW IN

    (

    [1],[2]

    )

    )

    AS pvt

    )

    AS Y

    )

    ,

    cte3 AS

    (

    SELECT TOP 100 PERCENT

    ROW - ((ROW - 1) % 3) AS block,

    * FROM cte2 ORDER BY ROW

    )

    SELECT firstname, surname, age

    FROM

    (

    SELECT block, COL1, COL2

    FROM cte3

    ) AS pvt

    PIVOT

    (

    MAX(COL2) FOR COL1 IN (firstname, surname, age)

    ) AS Y

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply