Denormalizing Normalized data

Question

Denormalizing Normalized data

liamdemasi

Old Hand

Points: 324
More actions
April 14, 2015 at 7:24 am

#317400

I have a requirement to provide data in a denormalized form from normalized tables. Working in SSIS.
I have two tables: EmployeeCountry and Country.
EmployeeCountry
EmployeeId (PK)(FK)
CountryId(PK)(FK)
Country
CountryId (PK)
CountryName
There will only be a max of 3 Country entries for each Employee. So I want to select the EmployeeId and get the three CountryIds so it would look like this:
Employee
EmployeeId
CountryId1
CountryId2
CountryId3
Any suggestions?

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply

Sean Lange SSC Guru Points: 286609 More actions · Answer 1

liamdemasi (4/14/2015)
I have a requirement to provide data in a denormalized form from normalized tables. Working in SSIS.
I have two tables: EmployeeCountry and Country.
EmployeeCountry
EmployeeId (PK)(FK)
CountryId(PK)(FK)
Country
CountryId (PK)
CountryName
There will only be a max of 3 Country entries for each Employee. So I want to select the EmployeeId and get the three CountryIds so it would look like this:
Employee
EmployeeId
CountryId1
CountryId2
CountryId3
Any suggestions?

If you did this who would you know if a given value is an EmployeeId or a CountryId? Or is this only for a given EmployeeId?

If it is only for a given employeeId why not just use a UNION ALL

select @EmployeeId

UNION ALL

select ec.CountryId

from EmployeeCountry ec

where ec.EmployeeId = @EmployeeId

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

liamdemasi Old Hand Points: 324 More actions · Answer 2

They would know which is the EmployeeId and which are the CountryIds by the column headings in the table I will insert the data into.

I will attempt your suggestion and provide feedback.

Sean Lange SSC Guru Points: 286609 More actions · Answer 3

liamdemasi (4/14/2015)
They would know which is the EmployeeId and which are the CountryIds by the column headings in the table I will insert the data into.
I will attempt your suggestion and provide feedback.

That doesn't make much sense. If you are going to split back into two tables why denormalize it? If you have data for more than one employee all you are going to have is a list of IDs and no way to know what is what.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

liamdemasi Old Hand Points: 324 More actions · Answer 4

I am not splitting it back into 2 tables. I am trying to get the CountryIds associated with an EmployeeId in the EmployeeCountry table, then I am going to insert it into an Employee table that has the columns EmployeeId, Country1Id, Country2Id, Country3Id.

So the basic query would be something like:

SELECT EmployeeId, CountryId from EmployeeCountry where EmployeeId = 10001

So then I will have 3 rows of data with EmployeeId, Country Id in each row:

EmployeeID CountryId

10001 20001

10001 20002

10001 20003

Then I want to insert the CountryIds into the Employee table such as:

EmployeeId Country1Id Country2Id Country3Id

10001 20001 20002 20003

Sean Lange SSC Guru Points: 286609 More actions · Answer 5

liamdemasi (4/14/2015)
I am not splitting it back into 2 tables. I am trying to get the CountryIds associated with an EmployeeId in the EmployeeCountry table, then I am going to insert it into an Employee table that has the columns EmployeeId, Country1Id, Country2Id, Country3Id.
So the basic query would be something like:
SELECT EmployeeId, CountryId from EmployeeCountry where EmployeeId = 10001
So then I will have 3 rows of data with EmployeeId, Country Id in each row:
EmployeeID CountryId
10001 20001
10001 20002
10001 20003
Then I want to insert the CountryIds into the Employee table such as:
EmployeeId Country1Id Country2Id Country3Id
10001 20001 20002 20003

Ugh. Any chance you can normalize the target? If not, you need to make sure you limit the number of countryIds so the new table has a place for all of them.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

liamdemasi Old Hand Points: 324 More actions · Answer 6

Unfortunately, I cannot normalize the target as the system that will be consuming the data needs it in a flat table.

Sean Lange SSC Guru Points: 286609 More actions · Answer 7

liamdemasi (4/14/2015)
Unfortunately, I cannot normalize the target as the system that will be consuming the data needs it in a flat table.

Then you will need to do this one employee at a time and also use a top X for CountryCodes since you could have more than the target can handle.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

liamdemasi Old Hand Points: 324 More actions · Answer 8

Right, I figured that. My real question, though, is how do I insert the data into the denormalized table?

My select: Select TOP 3 FROM EmployeeCountry where EmployeeId = 10001

Will give me:

EmployeeId CountryId

10001 20001

10001 20002

10001 20003

So how do I step through these to get them into the separate fields in my flat table?

EmployeeId CountryId1 CountryId2 CountryId3

Sean Lange SSC Guru Points: 286609 More actions · Answer 9

liamdemasi (4/14/2015)
Right, I figured that. My real question, though, is how do I insert the data into the denormalized table?
My select: Select TOP 3 FROM EmployeeCountry where EmployeeId = 10001
Will give me:
EmployeeId CountryId
10001 20001
10001 20002
10001 20003
So how do I step through these to get them into the separate fields in my flat table?
EmployeeId CountryId1 CountryId2 CountryId3

Gotcha. That is quite a bit different than your original post. 😉 No biggie. You can do this with a cross tab. Take a look at the links in my signature about how to do this. You won't need to use the dynamic version because you have a max number of columns thanks to the denormalized table on the other end.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Sean Lange SSC Guru Points: 286609 More actions · Answer 10

Something like this should be pretty close. Since you didn't post ddl or data this is untested but it should work.

declare @EmployeeID int = 10001;

with SortedVals as

(

select CountryId

, ROW_NUMBER () over (order by CountryID) as RowNum

FROM EmployeeCountry

where EmployeeId = @EmployeeID

)

Select @EmployeeID

, MAX(case when RowNum = 1 then CountryID end) as CountryID1

, MAX(case when RowNum = 2 then CountryID end) as CountryID2

, MAX(case when RowNum = 3 then CountryID end) as CountryID3

from SortedVals

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

x SSC-Insane Points: 23660 More actions · Answer 11

Sean Lange (4/14/2015)
liamdemasi (4/14/2015)
I am not splitting it back into 2 tables. I am trying to get the CountryIds associated with an EmployeeId in the EmployeeCountry table, then I am going to insert it into an Employee table that has the columns EmployeeId, Country1Id, Country2Id, Country3Id.
So the basic query would be something like:
SELECT EmployeeId, CountryId from EmployeeCountry where EmployeeId = 10001
So then I will have 3 rows of data with EmployeeId, Country Id in each row:
EmployeeID CountryId
10001 20001
10001 20002
10001 20003
Then I want to insert the CountryIds into the Employee table such as:
EmployeeId Country1Id Country2Id Country3Id
10001 20001 20002 20003
Ugh. Any chance you can normalize the target? If not, you need to make sure you limit the number of countryIds so the new table has a place for all of them.

"There will only be a max of 3 Country entries for each Employee."

I've found this particular denormalization to be a popular request in my experience. Since the number of countries will be limitted to those three columns, I'd probably just use three updates, assuming we can take advantage of the sortability of the countryid.

I'm also just as sure that there are more clever solutions out there 😉

but anyways, this is my guess!

create table EmployeeCountry

(

EmployeeID int,

CountryID int

)

create table Employee

(

EmployeeID int,

Country1ID int,

Country2ID int,

Country3ID int

)

insert into EmployeeCountry select 10001,20001

insert into EmployeeCountry select 10001,20002

insert into EmployeeCountry select 10001,20003

insert into EmployeeCountry select 10002,20001

insert into EmployeeCountry select 10002,20004

insert into EmployeeCountry select 10002,20005

insert into EmployeeCountry select 10003,20005

insert into EmployeeCountry select 10003,20006

insert into Employee select 10001, null, null, null

insert into Employee select 10002, null, null, null

insert into Employee select 10003, null, null, null

UPDATE Employee SET Country1Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID)

UPDATE Employee SET Country2Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID

AND CountryID <> Country1Id)

UPDATE Employee SET Country3Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID

AND CountryID <> Country1Id AND CountryID <> Country2Id)

select * from Employee

select * from EmployeeCountry

Sean Lange SSC Guru Points: 286609 More actions · Answer 12

patrickmcginnis59 10839 (4/14/2015)
Sean Lange (4/14/2015)
liamdemasi (4/14/2015)
I am not splitting it back into 2 tables. I am trying to get the CountryIds associated with an EmployeeId in the EmployeeCountry table, then I am going to insert it into an Employee table that has the columns EmployeeId, Country1Id, Country2Id, Country3Id.
So the basic query would be something like:
SELECT EmployeeId, CountryId from EmployeeCountry where EmployeeId = 10001
So then I will have 3 rows of data with EmployeeId, Country Id in each row:
EmployeeID CountryId
10001 20001
10001 20002
10001 20003
Then I want to insert the CountryIds into the Employee table such as:
EmployeeId Country1Id Country2Id Country3Id
10001 20001 20002 20003
Ugh. Any chance you can normalize the target? If not, you need to make sure you limit the number of countryIds so the new table has a place for all of them.
"There will only be a max of 3 Country entries for each Employee."
I've found this particular denormalization to be a popular request in my experience. Since the number of countries will be limitted to those three columns, I'd probably just use three updates, assuming we can take advantage of the sortability of the countryid.
I'm also just as sure that there are more clever solutions out there 😉
but anyways, this is my guess!
create table EmployeeCountry
(
EmployeeID int,
CountryID int
)
create table Employee
(
EmployeeID int,
Country1ID int,
Country2ID int,
Country3ID int
)
insert into EmployeeCountry select 10001,20001
insert into EmployeeCountry select 10001,20002
insert into EmployeeCountry select 10001,20003
insert into EmployeeCountry select 10002,20001
insert into EmployeeCountry select 10002,20004
insert into EmployeeCountry select 10002,20005
insert into EmployeeCountry select 10003,20005
insert into EmployeeCountry select 10003,20006
insert into Employee select 10001, null, null, null
insert into Employee select 10002, null, null, null
insert into Employee select 10003, null, null, null
UPDATE Employee SET Country1Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID)
UPDATE Employee SET Country2Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID
AND CountryID <> Country1Id)
UPDATE Employee SET Country3Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID
AND CountryID <> Country1Id AND CountryID <> Country2Id)
select * from Employee
select * from EmployeeCountry

For a more "clever" solution you could look at the cross tab I posted. It would be a single query instead of 4 for each row in the target.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

x SSC-Insane Points: 23660 More actions · Answer 13

Sean Lange (4/14/2015)
patrickmcginnis59 10839 (4/14/2015)
Sean Lange (4/14/2015)
liamdemasi (4/14/2015)
I am not splitting it back into 2 tables. I am trying to get the CountryIds associated with an EmployeeId in the EmployeeCountry table, then I am going to insert it into an Employee table that has the columns EmployeeId, Country1Id, Country2Id, Country3Id.
So the basic query would be something like:
SELECT EmployeeId, CountryId from EmployeeCountry where EmployeeId = 10001
So then I will have 3 rows of data with EmployeeId, Country Id in each row:
EmployeeID CountryId
10001 20001
10001 20002
10001 20003
Then I want to insert the CountryIds into the Employee table such as:
EmployeeId Country1Id Country2Id Country3Id
10001 20001 20002 20003
Ugh. Any chance you can normalize the target? If not, you need to make sure you limit the number of countryIds so the new table has a place for all of them.
"There will only be a max of 3 Country entries for each Employee."
I've found this particular denormalization to be a popular request in my experience. Since the number of countries will be limitted to those three columns, I'd probably just use three updates, assuming we can take advantage of the sortability of the countryid.
I'm also just as sure that there are more clever solutions out there 😉
but anyways, this is my guess!
create table EmployeeCountry
(
EmployeeID int,
CountryID int
)
create table Employee
(
EmployeeID int,
Country1ID int,
Country2ID int,
Country3ID int
)
insert into EmployeeCountry select 10001,20001
insert into EmployeeCountry select 10001,20002
insert into EmployeeCountry select 10001,20003
insert into EmployeeCountry select 10002,20001
insert into EmployeeCountry select 10002,20004
insert into EmployeeCountry select 10002,20005
insert into EmployeeCountry select 10003,20005
insert into EmployeeCountry select 10003,20006
insert into Employee select 10001, null, null, null
insert into Employee select 10002, null, null, null
insert into Employee select 10003, null, null, null
UPDATE Employee SET Country1Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID)
UPDATE Employee SET Country2Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID
AND CountryID <> Country1Id)
UPDATE Employee SET Country3Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID
AND CountryID <> Country1Id AND CountryID <> Country2Id)
select * from Employee
select * from EmployeeCountry
For a more "clever" solution you could look at the cross tab I posted. It would be a single query instead of 4 for each row in the target.

The reason I'm partial to mine, is that its three updates for all the employees instead of row by row.

edit: its actually one update per column. I've used something similar and if you have a giant stack of employees, it seems to be pretty efficient. Now the downside is that the subqueries might impact things, but I remember that for several thousand (scores, in my case), it was very nice for my use case.

Sean Lange SSC Guru Points: 286609 More actions · Answer 14

patrickmcginnis59 10839 (4/14/2015)
Sean Lange (4/14/2015)
patrickmcginnis59 10839 (4/14/2015)
Sean Lange (4/14/2015)
liamdemasi (4/14/2015)
I am not splitting it back into 2 tables. I am trying to get the CountryIds associated with an EmployeeId in the EmployeeCountry table, then I am going to insert it into an Employee table that has the columns EmployeeId, Country1Id, Country2Id, Country3Id.
So the basic query would be something like:
SELECT EmployeeId, CountryId from EmployeeCountry where EmployeeId = 10001
So then I will have 3 rows of data with EmployeeId, Country Id in each row:
EmployeeID CountryId
10001 20001
10001 20002
10001 20003
Then I want to insert the CountryIds into the Employee table such as:
EmployeeId Country1Id Country2Id Country3Id
10001 20001 20002 20003
Ugh. Any chance you can normalize the target? If not, you need to make sure you limit the number of countryIds so the new table has a place for all of them.
"There will only be a max of 3 Country entries for each Employee."
I've found this particular denormalization to be a popular request in my experience. Since the number of countries will be limitted to those three columns, I'd probably just use three updates, assuming we can take advantage of the sortability of the countryid.
I'm also just as sure that there are more clever solutions out there 😉
but anyways, this is my guess!
create table EmployeeCountry
(
EmployeeID int,
CountryID int
)
create table Employee
(
EmployeeID int,
Country1ID int,
Country2ID int,
Country3ID int
)
insert into EmployeeCountry select 10001,20001
insert into EmployeeCountry select 10001,20002
insert into EmployeeCountry select 10001,20003
insert into EmployeeCountry select 10002,20001
insert into EmployeeCountry select 10002,20004
insert into EmployeeCountry select 10002,20005
insert into EmployeeCountry select 10003,20005
insert into EmployeeCountry select 10003,20006
insert into Employee select 10001, null, null, null
insert into Employee select 10002, null, null, null
insert into Employee select 10003, null, null, null
UPDATE Employee SET Country1Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID)
UPDATE Employee SET Country2Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID
AND CountryID <> Country1Id)
UPDATE Employee SET Country3Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID
AND CountryID <> Country1Id AND CountryID <> Country2Id)
select * from Employee
select * from EmployeeCountry
For a more "clever" solution you could look at the cross tab I posted. It would be a single query instead of 4 for each row in the target.
The reason I'm partial to mine, is that its three updates for all the employees instead of row by row.
edit: its actually one update per column. I've used something similar and if you have a giant stack of employees, it seems to be pretty efficient. Now the downside is that the subqueries might impact things, but I remember that for several thousand (scores, in my case), it was very nice for my use case.

OK. The point is that you have to do an insert (pulling from the base table) and then 3 updates (also pulling data from the base table). This makes 4 queries for every row in the target. The way I used is a single query for a row in the target table. It is about set based logic instead of RBAR (or in this case CBAC). Sure it will work but it will take 4 times longer because there are 4 times as many queries.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/