Transposing repeating sets of rows into colums

Question

Post reply

Transposing repeating sets of rows into colums

wiernyjacek

SSC Journeyman

Points: 92
More actions
March 13, 2010 at 2:42 pm

#217825

I have loaded a text file into one colum:
Table: tmpNames
===================
[firstname = "John"]
[surname = "Scott"]
[age = "12"]
[firstname = "Barbara"]
[surname = "White"]
[age = "15"]
..................
[firstname = "Alex"]
[surname = "Ferguson"]
[age = "19"]
Any tips and tricks on how I can transpose it a table with three columns, so I get
Table: myPeople
==============
firstname surname age
-----------------------------
John Scott 12
Barbara White 15
Alex Ferguson 19
Any help will be much appreciated!

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

LutzM SSC Guru Points: 107049 More actions · Answer 1

Ok, here's what I came up with. I'm not sure if it'll help but it might be at least something to start with.

CREATE TABLE #temp

(id INT IDENTITY(1,1),

col VARCHAR(60)

)

INSERT INTO #temp

SELECT 'Table: tmpNames' UNION ALL

SELECT '===================' UNION ALL

SELECT '[firstname = "John"]' UNION ALL

SELECT '[surname = "Scott"]' UNION ALL

SELECT '[age = "12"]' UNION ALL

SELECT '[firstname = "Barbara"]' UNION ALL

SELECT '[surname = "White"]' UNION ALL

SELECT '[age = "15"]' UNION ALL

SELECT '..................' UNION ALL

SELECT '[firstname = "Alex"]' UNION ALL

SELECT '[surname = "Ferguson"]' UNION ALL

SELECT '[age = "19"]'

;WITH cte AS -- get the row number for each first name as a group separator

(

SELECT ROW_NUMBER() OVER(ORDER BY id) AS ROW,

*

FROM #temp

WHERE col LIKE '%firstname%'

),

cte2 AS -- find the ranges per group

(

SELECT

cte1.row,

cte1.id AS BeginSec,

CASE WHEN cte2.id IS NULL THEN (SELECT MAX(id) FROM #temp)+1 ELSE cte2.id END AS EndSec

FROM cte cte1

LEFT OUTER JOIN cte cte2

ON cte1.row = cte2.row-1

),

cte3 AS -- eliminate brackets and separate the value per column excluding quotation marks

(

SELECT cte2.row,REPLACE(REPLACE(col,'[',''),']','') AS sub,

REPLACE(SUBSTRING(col,CHARINDEX('=',col)+1,LEN(col)-(CHARINDEX('=',col)+1)),'"','') AS val

FROM cte2

INNER JOIN #temp

ON #temp.id>=BeginSec

AND #temp.id<EndSec

WHERE col LIKE '/[%' ESCAPE '/'

)

-- final output using CrossTabs

SELECT

MAX(CASE WHEN sub LIKE 'firstname%' THEN val ELSE '' END) AS firstname,

MAX(CASE WHEN sub LIKE 'surname%' THEN val ELSE '' END) AS surname,

MAX(CASE WHEN sub LIKE 'age%' THEN val ELSE '' END) AS age

FROM cte3

GROUP BY ROW

/* result set

firstnamesurnameage

John Scott 12

Barbara White 15

Alex Ferguson 19

*/

Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question[/url]
How to post performance related questions[/url]
Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

wiernyjacek SSC Journeyman Points: 92 More actions · Answer 2

wiernyjacek

SSC Journeyman

Points: 92

March 14, 2010 at 9:34 am

#1133396

Thanks, Looks like it really does the job.

Jeff Moden SSC Guru Points: 1004683 More actions · Answer 3

wiernyjacek (3/13/2010)
I have loaded a text file into one colum:
Table: tmpNames
===================
[firstname = "John"]
[surname = "Scott"]
[age = "12"]
[firstname = "Barbara"]
[surname = "White"]
[age = "15"]
..................
[firstname = "Alex"]
[surname = "Ferguson"]
[age = "19"]
Any tips and tricks on how I can transpose it a table with three columns, so I get
Table: myPeople
==============
firstname surname age
-----------------------------
John Scott 12
Barbara White 15
Alex Ferguson 19
Any help will be much appreciated!

Ummm... what do the first 4 lines of the text file actually look like? I ask because I believe you may be making it hard on yourself. Please remember not to post any private information.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

wiernyjacek SSC Journeyman Points: 92 More actions · Answer 4

These are fake names, so no private information is sent here. However I analysed the file again and it looks like the age is not always given, so the file may look like this.

I have loaded a text file into one colum:

[firstname = "John"]

[surname = "Scott"]

[age = "12"]

[firstname = "Barbara"]

[surname = "White"]

[firstname = "Alex"]

[surname = "Ferguson"]

[age = "19"]

In such case the age should be inserted as NULL.

Jeff Moden SSC Guru Points: 1004683 More actions · Answer 5

wiernyjacek (3/15/2010)
These are fake names, so no private information is sent here. However I analysed the file again and it looks like the age is not always given, so the file may look like this.
I have loaded a text file into one colum:
[firstname = "John"]
[surname = "Scott"]
[age = "12"]
[firstname = "Barbara"]
[surname = "White"]
[firstname = "Alex"]
[surname = "Ferguson"]
[age = "19"]
In such case the age should be inserted as NULL.

That doesn't help me help you... I think you may be loading the text file incorrectly. Please post the first 4 lines of the file so I can help here.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

steve-893342 SSCarpal Tunnel Points: 4439 More actions · Answer 6

Out of interest, I have come up with another variation which produces the same result as Lutz

IF NOT OBJECT_ID ('tempdb.dbo.#FOO', 'U') IS NULL

DROP TABLE #FOO

SELECT 'Table: tmpNames' AS BulkColumn INTO #FOO UNION ALL

SELECT '===================' UNION ALL

SELECT '[firstname = "John"]' UNION ALL

SELECT '[surname = "Scott"]' UNION ALL

SELECT '[age = "12"]' UNION ALL

SELECT '[firstname = "Barbara"]' UNION ALL

SELECT '[surname = "White"]' UNION ALL

SELECT '[age = "15"]' UNION ALL

SELECT '..................' UNION ALL

SELECT '[firstname = "Alex"]' UNION ALL

SELECT '[surname = "Ferguson"]' UNION ALL

SELECT '[age = "19"]'

;

WITH cteTally

AS

(

SELECT TOP 4000 ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS pk

FROM master.sys.All_Columns

)

,

cte2 AS

(

SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS ROW,

* FROM #FOO

CROSS APPLY

(

SELECT

REPLACE([1], '[', '') AS COL1, REPLACE(REPLACE([2], '"', ''), ']', '') AS COL2

FROM

(

SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS ROW,

NULLIF(SUBSTRING(BulkColumn + '=', pk, CHARINDEX('=', BulkColumn + '=', pk) - pk), '') AS Value

FROM cteTally

WHERE pk < LEN(BulkColumn) + 2 AND SUBSTRING('=' + BulkColumn + '=', pk, 1) = '='

) AS Z

PIVOT

(

MAX(Value) FOR ROW IN

(

[1],[2]

)

AS pvt

)

AS Y

WHERE LEFT(BulkColumn, 4) IN ('[fir', '[sur', '[age')

)

,

cte3 AS

(

SELECT

ROW - ((ROW - 1) % 3) AS block,

* FROM cte2

)

SELECT firstname, surname, age

FROM

(

SELECT block, COL1, COL2

FROM cte3

) AS pvt

PIVOT

(

MAX(COL2) FOR COL1 IN (firstname, surname, age)

) AS Y

Jeff Moden SSC Guru Points: 1004683 More actions · Answer 7

The problem is that there's nothing in the bulk table to guarantee the order and I don't trust an unindexed heap to do so.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

steve-893342 SSCarpal Tunnel Points: 4439 More actions · Answer 8

OK then, say we don't even risk putting the file into a table and run the CTE directly on the file using OPENROWSET BULK. Then in cte3 we can specifically set the ORDER BY clause. Seems to work OK.

WITH cteFOO

AS

(

SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS ROW,

BulkColumn FROM OPENROWSET (BULK 'C:\Transpose.txt', FORMATFILE = 'C:\BulkFile.fmt') AS Z

WHERE LEFT(BulkColumn, 4) IN ('[fir', '[sur', '[age')

)

,

cteTally

AS

(

SELECT TOP 4000 ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS pk

FROM master.sys.All_Columns

)

,

cte2 AS

(

SELECT

ROW, BulkColumn, COL1, COL2

FROM cteFOO

CROSS APPLY

(

SELECT

REPLACE([1], '[', '') AS COL1, REPLACE(REPLACE([2], '"', ''), ']', '') AS COL2

FROM

(

SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS ROW,

NULLIF(SUBSTRING(BulkColumn + '=', pk, CHARINDEX('=', BulkColumn + '=', pk) - pk), '') AS Value

FROM cteTally

WHERE pk < LEN(BulkColumn) + 2 AND SUBSTRING('=' + BulkColumn + '=', pk, 1) = '='

) AS Z

PIVOT

(

MAX(Value) FOR ROW IN

(

[1],[2]

)

AS pvt

)

AS Y

)

,

cte3 AS

(

SELECT TOP 100 PERCENT

ROW - ((ROW - 1) % 3) AS block,

* FROM cte2 ORDER BY ROW

)

SELECT firstname, surname, age

FROM

(

SELECT block, COL1, COL2

FROM cte3

) AS pvt

PIVOT

(

MAX(COL2) FOR COL1 IN (firstname, surname, age)

) AS Y