March 13, 2010 at 2:42 pm
I have loaded a text file into one colum:
Table: tmpNames
===================
[firstname = "John"]
[surname = "Scott"]
[age = "12"]
[firstname = "Barbara"]
[surname = "White"]
[age = "15"]
..................
[firstname = "Alex"]
[surname = "Ferguson"]
[age = "19"]
Any tips and tricks on how I can transpose it a table with three columns, so I get
Table: myPeople
==============
firstname surname age
-----------------------------
John Scott 12
Barbara White 15
Alex Ferguson 19
Any help will be much appreciated!
March 13, 2010 at 5:33 pm
Ok, here's what I came up with. I'm not sure if it'll help but it might be at least something to start with.
CREATE TABLE #temp
(id INT IDENTITY(1,1),
col VARCHAR(60)
)
INSERT INTO #temp
SELECT 'Table: tmpNames' UNION ALL
SELECT '===================' UNION ALL
SELECT '[firstname = "John"]' UNION ALL
SELECT '[surname = "Scott"]' UNION ALL
SELECT '[age = "12"]' UNION ALL
SELECT '[firstname = "Barbara"]' UNION ALL
SELECT '[surname = "White"]' UNION ALL
SELECT '[age = "15"]' UNION ALL
SELECT '..................' UNION ALL
SELECT '[firstname = "Alex"]' UNION ALL
SELECT '[surname = "Ferguson"]' UNION ALL
SELECT '[age = "19"]'
;WITH cte AS -- get the row number for each first name as a group separator
(
SELECT ROW_NUMBER() OVER(ORDER BY id) AS ROW,
*
FROM #temp
WHERE col LIKE '%firstname%'
),
cte2 AS -- find the ranges per group
(
SELECT
cte1.row,
cte1.id AS BeginSec,
CASE WHEN cte2.id IS NULL THEN (SELECT MAX(id) FROM #temp)+1 ELSE cte2.id END AS EndSec
FROM cte cte1
LEFT OUTER JOIN cte cte2
ON cte1.row = cte2.row-1
),
cte3 AS -- eliminate brackets and separate the value per column excluding quotation marks
(
SELECT cte2.row,REPLACE(REPLACE(col,'[',''),']','') AS sub,
REPLACE(SUBSTRING(col,CHARINDEX('=',col)+1,LEN(col)-(CHARINDEX('=',col)+1)),'"','') AS val
FROM cte2
INNER JOIN #temp
ON #temp.id>=BeginSec
AND #temp.id<EndSec
WHERE col LIKE '/[%' ESCAPE '/'
)
-- final output using CrossTabs
SELECT
MAX(CASE WHEN sub LIKE 'firstname%' THEN val ELSE '' END) AS firstname,
MAX(CASE WHEN sub LIKE 'surname%' THEN val ELSE '' END) AS surname,
MAX(CASE WHEN sub LIKE 'age%' THEN val ELSE '' END) AS age
FROM cte3
GROUP BY ROW
/* result set
firstnamesurnameage
John Scott 12
Barbara White 15
Alex Ferguson 19
*/
March 14, 2010 at 9:34 am
Thanks, Looks like it really does the job.
March 14, 2010 at 9:44 pm
wiernyjacek (3/13/2010)
I have loaded a text file into one colum:Table: tmpNames
===================
[firstname = "John"]
[surname = "Scott"]
[age = "12"]
[firstname = "Barbara"]
[surname = "White"]
[age = "15"]
..................
[firstname = "Alex"]
[surname = "Ferguson"]
[age = "19"]
Any tips and tricks on how I can transpose it a table with three columns, so I get
Table: myPeople
==============
firstname surname age
-----------------------------
John Scott 12
Barbara White 15
Alex Ferguson 19
Any help will be much appreciated!
Ummm... what do the first 4 lines of the text file actually look like? I ask because I believe you may be making it hard on yourself. Please remember not to post any private information.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 15, 2010 at 9:53 am
These are fake names, so no private information is sent here. However I analysed the file again and it looks like the age is not always given, so the file may look like this.
I have loaded a text file into one colum:
[firstname = "John"]
[surname = "Scott"]
[age = "12"]
[firstname = "Barbara"]
[surname = "White"]
[firstname = "Alex"]
[surname = "Ferguson"]
[age = "19"]
In such case the age should be inserted as NULL.
March 15, 2010 at 2:23 pm
wiernyjacek (3/15/2010)
These are fake names, so no private information is sent here. However I analysed the file again and it looks like the age is not always given, so the file may look like this.I have loaded a text file into one colum:
[firstname = "John"]
[surname = "Scott"]
[age = "12"]
[firstname = "Barbara"]
[surname = "White"]
[firstname = "Alex"]
[surname = "Ferguson"]
[age = "19"]
In such case the age should be inserted as NULL.
That doesn't help me help you... I think you may be loading the text file incorrectly. Please post the first 4 lines of the file so I can help here.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 17, 2010 at 5:02 am
Out of interest, I have come up with another variation which produces the same result as Lutz
IF NOT OBJECT_ID ('tempdb.dbo.#FOO', 'U') IS NULL
DROP TABLE #FOO
SELECT 'Table: tmpNames' AS BulkColumn INTO #FOO UNION ALL
SELECT '===================' UNION ALL
SELECT '[firstname = "John"]' UNION ALL
SELECT '[surname = "Scott"]' UNION ALL
SELECT '[age = "12"]' UNION ALL
SELECT '[firstname = "Barbara"]' UNION ALL
SELECT '[surname = "White"]' UNION ALL
SELECT '[age = "15"]' UNION ALL
SELECT '..................' UNION ALL
SELECT '[firstname = "Alex"]' UNION ALL
SELECT '[surname = "Ferguson"]' UNION ALL
SELECT '[age = "19"]'
;
WITH cteTally
AS
(
SELECT TOP 4000 ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS pk
FROM master.sys.All_Columns
)
,
cte2 AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS ROW,
* FROM #FOO
CROSS APPLY
(
SELECT
REPLACE([1], '[', '') AS COL1, REPLACE(REPLACE([2], '"', ''), ']', '') AS COL2
FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS ROW,
NULLIF(SUBSTRING(BulkColumn + '=', pk, CHARINDEX('=', BulkColumn + '=', pk) - pk), '') AS Value
FROM cteTally
WHERE pk < LEN(BulkColumn) + 2 AND SUBSTRING('=' + BulkColumn + '=', pk, 1) = '='
) AS Z
PIVOT
(
MAX(Value) FOR ROW IN
(
[1],[2]
)
)
AS pvt
)
AS Y
WHERE LEFT(BulkColumn, 4) IN ('[fir', '[sur', '[age')
)
,
cte3 AS
(
SELECT
ROW - ((ROW - 1) % 3) AS block,
* FROM cte2
)
SELECT firstname, surname, age
FROM
(
SELECT block, COL1, COL2
FROM cte3
) AS pvt
PIVOT
(
MAX(COL2) FOR COL1 IN (firstname, surname, age)
) AS Y
March 17, 2010 at 9:23 am
The problem is that there's nothing in the bulk table to guarantee the order and I don't trust an unindexed heap to do so.
--Jeff Moden
Change is inevitable... Change for the better is not.
March 17, 2010 at 11:19 am
OK then, say we don't even risk putting the file into a table and run the CTE directly on the file using OPENROWSET BULK. Then in cte3 we can specifically set the ORDER BY clause. Seems to work OK.
WITH cteFOO
AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS ROW,
BulkColumn FROM OPENROWSET (BULK 'C:\Transpose.txt', FORMATFILE = 'C:\BulkFile.fmt') AS Z
WHERE LEFT(BulkColumn, 4) IN ('[fir', '[sur', '[age')
)
,
cteTally
AS
(
SELECT TOP 4000 ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS pk
FROM master.sys.All_Columns
)
,
cte2 AS
(
SELECT
ROW, BulkColumn, COL1, COL2
FROM cteFOO
CROSS APPLY
(
SELECT
REPLACE([1], '[', '') AS COL1, REPLACE(REPLACE([2], '"', ''), ']', '') AS COL2
FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) AS ROW,
NULLIF(SUBSTRING(BulkColumn + '=', pk, CHARINDEX('=', BulkColumn + '=', pk) - pk), '') AS Value
FROM cteTally
WHERE pk < LEN(BulkColumn) + 2 AND SUBSTRING('=' + BulkColumn + '=', pk, 1) = '='
) AS Z
PIVOT
(
MAX(Value) FOR ROW IN
(
[1],[2]
)
)
AS pvt
)
AS Y
)
,
cte3 AS
(
SELECT TOP 100 PERCENT
ROW - ((ROW - 1) % 3) AS block,
* FROM cte2 ORDER BY ROW
)
SELECT firstname, surname, age
FROM
(
SELECT block, COL1, COL2
FROM cte3
) AS pvt
PIVOT
(
MAX(COL2) FOR COL1 IN (firstname, surname, age)
) AS Y
Viewing 9 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply