Normalize data using Pivot WITHOUT using Aggregate function

Question

Normalize data using Pivot WITHOUT using Aggregate function

jjbear11

Default port

Points: 1497
More actions
December 29, 2009 at 8:13 am

#228768

I've searched the forums here (which usually have a wealth of knowledge), but have been unable to find an answer to my problem. Below is the layout of my table as it is currently:
Below is how I want the table to turn out:
TABLE CURRENTLY:
UniqueIDColumnNameText
1ID123146
2SubjectAreaMath
3LocalCourseCode000A1
4SchoolID123
5SourceHub
6CourseTitleAlgebra I
7ID94503
8SubjectAreaScience
9LocalCourseCode9037
10SchoolID46
11SourceAnnex
12CourseTitleBiology II
WHAT I WANT:
IDSubjectAreaLocalCourseCodeSchoolIDSourceCourseTitle
123146Math000A1123HubAlgebra I
94503Science903746AnnexBiology II
I am currently using the PIVOT function in SQL Server 2008. However, using this function still requires the use of an aggregate, which kills what I'm trying to do. I am currently using min(text) for ColumnName in (<ColumnNames>).
However, that does not produce correct results. I would like to use the PIVOT function because of speed (original table has appx 900,000 rows).
Any help would be much appreciated!

Viewing 15 posts - 1 through 15 (of 50 total)

You must be logged in to reply to this topic. Login to reply

Ray K SSC-Dedicated Points: 31148 More actions · Answer 1

I'd like to take a crack at this. Can you put your data into an actual text format (not an image) so I can try it out? (It'd be helpful if you can put your data into a table variable that mimics your sample data.)

Just curious -- what is your PIVOT actually giving you? If I'm not mistaken, I don't think you can run a PIVOT without using an aggregate. I messed around with it a while back, and couldn't get it to work without the aggregate. I ended up setting up a "dummy" aggregate to get it to work.

Hopefully someone who knows PIVOT better than me can answer that better.

+--------------------------------------------------------------------------------------+
Check out my blog at https://pianorayk.wordpress.com/

jjbear11 Default port Points: 1497 More actions · Answer 2

Please see my original post for code to copy/paste.

I also setup a "dummy" aggregate to see if I could get that to work -- I used min(text) for columnname. However, using that aggregate with the data which I'm presenting produces problems:

1. How to determine min(text) when it is possible for the TEXT of a particular column to contain both Numeric and Alphabetic data?

2. In my example data, for the ID column, we know that 95403 is lower than 123146. However, when doing Min(text), I will get 123146 returned, since the lowest of the first character of the number (1 vs 9) is 1.

These are just a smattering of the issues which I am running into.

Ray K SSC-Dedicated Points: 31148 More actions · Answer 3

Thanks . . . actually, what I was looking for was this (I'm posting this here for anyone else who wants to try this problem):

declare @school table ([ID] int, ColumnName varchar(15), [Text] varchar(50))

insert into @school values (1, 'ID', '123146')

insert into @school values (2, 'SubjectArea', 'Math')

insert into @school values (3, 'LocalCourseCode', '000A1')

insert into @school values (4, 'SchoolID', '123')

insert into @school values (5, 'Source', 'Hub')

insert into @school values (6, 'CourseTitle', 'Algebra I')

insert into @school values (7, 'ID', '94503')

insert into @school values (8, 'SubjectArea', 'Science')

insert into @school values (9, 'LocalCourseCode', '9037')

insert into @school values (10, 'SchoolID', '46')

insert into @school values (11, 'Source', 'Annex')

insert into @school values (12, 'CourseTitle', 'Biology II')

select * from @school

Just for your future reference -- if you post/present your data like this, you'll probably get faster responses! 🙂 (Basically, help us help you -- the less work you make for everyone, the sooner people will respond!)

I guessed on the field types -- let me know if they look correct!

Let me mess around with this a little bit. If I come up with anything useful, I'll post again!

+--------------------------------------------------------------------------------------+
Check out my blog at https://pianorayk.wordpress.com/

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 4

Hi Justin

This might sound daft, but bear with me...

How do you know that UniqueID 4 is the school for ID 123146, and not UniqueID 10? Or to put it ab

another way, is each block of 6 consecutive UniqueID's destined to become a single row in your output?

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

jjbear11 Default port Points: 1497 More actions · Answer 5

Yes, that is correct. Each block of 6 is destined to become a singular row in the final table.

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 6

Justin James (12/29/2009)
Yes, that is correct. Each block of 6 is destined to become a singular row in the final table.

Is there a missing column which would uniquely identify each block of six rows? Don't worry if there isn't because there's a workaround, but it would help a little.

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

jjbear11 Default port Points: 1497 More actions · Answer 7

There is not. I was thinking of using ROW_NUMBER(), but was unsure how to partition by each block of 6 rows.

Lowell SSC Guru Points: 323504 More actions · Answer 8

Ray K (12/29/2009)
Just for your future reference -- if you post/present your data like this, you'll probably get faster responses! 🙂 (Basically, help us help you -- the less work you make for everyone, the sooner people will respond!)

Ray is sooooo right;

once the data is in a consumable format, I jumped at trying my hand at this; if you can, always give us the tables and inserts...

here's my version, and my example results; note the last column is off by one row, and i'm working on that, but this is getting fairly close to the desired results:

the same results:

REC ID SubjectArea LocalCourseCode SchoolID Source CourseTitle

----------- ------------ ---------------- --------- ------- ------------

0 123146 Math 000A1 123 Hub NULL

1 94503 Science 9037 46 Annex Algebra I

2 NULL NULL NULL NULL NULL Biology II

the test code

declare @school table ([ID] int, ColumnName varchar(15), [Text] varchar(50))

insert into @school values (1, 'ID', '123146')

insert into @school values (2, 'SubjectArea', 'Math')

insert into @school values (3, 'LocalCourseCode', '000A1')

insert into @school values (4, 'SchoolID', '123')

insert into @school values (5, 'Source', 'Hub')

insert into @school values (6, 'CourseTitle', 'Algebra I')

insert into @school values (7, 'ID', '94503')

insert into @school values (8, 'SubjectArea', 'Science')

insert into @school values (9, 'LocalCourseCode', '9037')

insert into @school values (10, 'SchoolID', '46')

insert into @school values (11, 'Source', 'Annex')

insert into @school values (12, 'CourseTitle', 'Biology II')

--select * from @school

SELECT REC,

[1] AS ID,

[2] AS SubjectArea,

[3] AS LocalCourseCode,

[4] AS SchoolID,

[5] AS Source,

[0] AS CourseTitle

FROM (SELECT [ID]/ 6 As REC,[ID]% 6 AS GRP,[Text] from @school) As TheSource

PIVOT

(

min([Text])

FOR

GRP --each value

IN ( [0], [1], [2], [3], [4], [5],[6])

) AS PIVOTALIAS

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

ChrisM@home SSC-Insane Points: 24260 More actions · Answer 9

Justin James (12/29/2009)
There is not. I was thinking of using ROW_NUMBER(), but was unsure how to partition by each block of 6 rows.

Like this:

;WITH CTE1 AS (

SELECT ((UniqueID-1)/6)+1 AS CourseBlock, [ID], ColumnName, [Text]

FROM YourTable

),

CTE2 AS (

SELECT CourseBlock,

ROW_NUMBER() OVER(PARTITION BY CourseBlock ORDER BY [ID]) AS CourseElement,

[ID], ColumnName, [Text]

FROM CTE1

)

SELECT * FROM CTE2

Edit: Currently unable to test, eval version expired during vacation

[font="Arial"]^{Low-hanging fruit picker and defender of the moggies}[/font]

For better assistance in answering your questions, please read this[/url].

Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

jjbear11 Default port Points: 1497 More actions · Answer 10

Thank you for the help Lowell. I'll make sure to include the appropriate SQL in my further posts.

One question I have about your SQL:

select ID/6 as REC, [ID]% 6 as GRP, TEXT from @school

What if the number of rows in each group is odd....say, 7? I tried changing the value to an odd number, and the results were skewed.

Ray K SSC-Dedicated Points: 31148 More actions · Answer 11

lobbymuncher (12/29/2009)
;WITH CTE1 AS (
SELECT ((UniqueID-1)/6)+1 AS CourseBlock, [ID], ColumnName, [Text]
FROM YourTable
),
CTE2 AS (
SELECT CourseBlock,
ROW_NUMBER() OVER(PARTITION BY CourseBlock ORDER BY [ID]) AS CourseElement,
[ID], ColumnName, [Text]
FROM CTE1
)
SELECT * FROM CTE2

Ummm . . . this is really convoluted (and total overkill, IMHO). Wouldn't it be better to just do this (or something similar):

select cast ([ID]/6 as int) from

Don't yet have an answer to your original question. I'm starting to think that a CTE (rather than a PIVOT) might be the way to go.

Any other suggestions from the peanut gallery?

+--------------------------------------------------------------------------------------+
Check out my blog at https://pianorayk.wordpress.com/

Lowell SSC Guru Points: 323504 More actions · Answer 12

Justin James (12/29/2009)
Thank you for the help Lowell. I'll make sure to include the appropriate SQL in my further posts.
One question I have about your SQL:
select ID/6 as REC, [ID]% 6 as GRP, TEXT from @school
What if the number of rows in each group is odd....say, 7? I tried changing the value to an odd number, and the results were skewed.

I'm assuming that the number of rows is alwasy exactly 6 items...if you have some groups that are 7, and others that are 6, we need a different strategy. The sample data is currently 6 items to a group.

All I'm doing is a different technique to get the same results that lobbymuncher posted.

you said that each group of 6 entries were a record...so i'm using integer division of ID / 6 to get a group for 6 items, and integer modulous to get 6 unique values from ID % 6

any number % 6 will return one of the following values: 1,2,3,4,5 or 0, which is what i was trying to use to group the values.

still not quite the way i had hoped it would work, but it is really close....

Lowell

--help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

jjbear11 Default port Points: 1497 More actions · Answer 13

Lowell (12/29/2009)
Justin James (12/29/2009)
Thank you for the help Lowell. I'll make sure to include the appropriate SQL in my further posts.
One question I have about your SQL:
select ID/6 as REC, [ID]% 6 as GRP, TEXT from @school
What if the number of rows in each group is odd....say, 7? I tried changing the value to an odd number, and the results were skewed.
I'm assuming that the number of rows is alwasy exactly 6 items...if you have some groups that are 7, and others that are 6, we need a different strategy. The sample data is currently 6 items to a group.
All I'm doing is a different technique to get the same results that lobbymuncher posted.
you said that each group of 6 entries were a record...so i'm using integer division of ID / 6 to get a group for 6 items, and integer modulous to get 6 unique values from ID % 6
any number % 6 will return one of the following values: 1,2,3,4,5 or 0, which is what i was trying to use to group the values.
still not quite the way i had hoped it would work, but it is really close....

As I looked through my data further (what I provided was just a subset), I see that the grouping of information change throughout it. :(. However, on the BRIGHT side (!), there is a way to tell when a new group occurs. A new group occurs whenever the value in the TEXT column is [DW].[DimClass].

That being said, here is the new test code:

declare @school table ([ID] int, ColumnName varchar(15), [Text] varchar(50))

--Group 1

insert into @school values (0, 'DestinationTableName', '[DW].[DimClass]')

insert into @school values (1, 'ID', '123146')

insert into @school values (2, 'SubjectArea', 'Math')

insert into @school values (3, 'LocalCourseCode', '000A1')

insert into @school values (4, 'SchoolID', '123')

insert into @school values (5, 'Source', 'Hub')

insert into @school values (6, 'CourseTitle', 'Algebra I')

--Group 2

insert into @school values (7, 'DestinationTableName', '[DW].[DimClass]')

insert into @school values (8, 'ID', '94503')

insert into @school values (9, 'SubjectArea', 'Science')

insert into @school values (10, 'LocalCourseCode', '9037')

insert into @school values (12, 'SchoolID', '46')

insert into @school values (12, 'Source', 'Annex')

insert into @school values (13, 'CourseTitle', 'Biology II')

insert into @school values (14, 'TeacherID', '58398')

insert into @school values (15, 'Credits Possible', '1.0')

--Group 3

insert into @school values (16, 'DestinationTableName', '[DW].[DimClass]')

insert into @school values (17, 'ID', '84023')

insert into @school values (18, 'Source', 'Annex')

insert into @school values (19, 'CourseTitle', 'Physics I')

select * from @school

Ray K SSC-Dedicated Points: 31148 More actions · Answer 14

Justin James (12/29/2009)
As I looked through my data further (what I provided was just a subset), I see that the grouping of information change throughout it. 🙁

Argh! My model just got blown out of the water!

Also, I got truncation errors when I ran the sample data code. I bumped the varchar(15) up to varchar(25), and that seemed to fix it.

+--------------------------------------------------------------------------------------+
Check out my blog at https://pianorayk.wordpress.com/