Denormalizing Normalized data

  • I have a requirement to provide data in a denormalized form from normalized tables. Working in SSIS.

    I have two tables: EmployeeCountry and Country.

    EmployeeCountry

    EmployeeId (PK)(FK)

    CountryId(PK)(FK)

    Country

    CountryId (PK)

    CountryName

    There will only be a max of 3 Country entries for each Employee. So I want to select the EmployeeId and get the three CountryIds so it would look like this:

    Employee

    EmployeeId

    CountryId1

    CountryId2

    CountryId3

    Any suggestions?

  • liamdemasi (4/14/2015)


    I have a requirement to provide data in a denormalized form from normalized tables. Working in SSIS.

    I have two tables: EmployeeCountry and Country.

    EmployeeCountry

    EmployeeId (PK)(FK)

    CountryId(PK)(FK)

    Country

    CountryId (PK)

    CountryName

    There will only be a max of 3 Country entries for each Employee. So I want to select the EmployeeId and get the three CountryIds so it would look like this:

    Employee

    EmployeeId

    CountryId1

    CountryId2

    CountryId3

    Any suggestions?

    If you did this who would you know if a given value is an EmployeeId or a CountryId? Or is this only for a given EmployeeId?

    If it is only for a given employeeId why not just use a UNION ALL

    select @EmployeeId

    UNION ALL

    select ec.CountryId

    from EmployeeCountry ec

    where ec.EmployeeId = @EmployeeId

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • They would know which is the EmployeeId and which are the CountryIds by the column headings in the table I will insert the data into.

    I will attempt your suggestion and provide feedback.

  • liamdemasi (4/14/2015)


    They would know which is the EmployeeId and which are the CountryIds by the column headings in the table I will insert the data into.

    I will attempt your suggestion and provide feedback.

    That doesn't make much sense. If you are going to split back into two tables why denormalize it? If you have data for more than one employee all you are going to have is a list of IDs and no way to know what is what.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • I am not splitting it back into 2 tables. I am trying to get the CountryIds associated with an EmployeeId in the EmployeeCountry table, then I am going to insert it into an Employee table that has the columns EmployeeId, Country1Id, Country2Id, Country3Id.

    So the basic query would be something like:

    SELECT EmployeeId, CountryId from EmployeeCountry where EmployeeId = 10001

    So then I will have 3 rows of data with EmployeeId, Country Id in each row:

    EmployeeID CountryId

    10001 20001

    10001 20002

    10001 20003

    Then I want to insert the CountryIds into the Employee table such as:

    EmployeeId Country1Id Country2Id Country3Id

    10001 20001 20002 20003

  • liamdemasi (4/14/2015)


    I am not splitting it back into 2 tables. I am trying to get the CountryIds associated with an EmployeeId in the EmployeeCountry table, then I am going to insert it into an Employee table that has the columns EmployeeId, Country1Id, Country2Id, Country3Id.

    So the basic query would be something like:

    SELECT EmployeeId, CountryId from EmployeeCountry where EmployeeId = 10001

    So then I will have 3 rows of data with EmployeeId, Country Id in each row:

    EmployeeID CountryId

    10001 20001

    10001 20002

    10001 20003

    Then I want to insert the CountryIds into the Employee table such as:

    EmployeeId Country1Id Country2Id Country3Id

    10001 20001 20002 20003

    Ugh. Any chance you can normalize the target? If not, you need to make sure you limit the number of countryIds so the new table has a place for all of them.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Unfortunately, I cannot normalize the target as the system that will be consuming the data needs it in a flat table.

  • liamdemasi (4/14/2015)


    Unfortunately, I cannot normalize the target as the system that will be consuming the data needs it in a flat table.

    Then you will need to do this one employee at a time and also use a top X for CountryCodes since you could have more than the target can handle.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Right, I figured that. My real question, though, is how do I insert the data into the denormalized table?

    My select: Select TOP 3 FROM EmployeeCountry where EmployeeId = 10001

    Will give me:

    EmployeeId CountryId

    10001 20001

    10001 20002

    10001 20003

    So how do I step through these to get them into the separate fields in my flat table?

    EmployeeId CountryId1 CountryId2 CountryId3

  • liamdemasi (4/14/2015)


    Right, I figured that. My real question, though, is how do I insert the data into the denormalized table?

    My select: Select TOP 3 FROM EmployeeCountry where EmployeeId = 10001

    Will give me:

    EmployeeId CountryId

    10001 20001

    10001 20002

    10001 20003

    So how do I step through these to get them into the separate fields in my flat table?

    EmployeeId CountryId1 CountryId2 CountryId3

    Gotcha. That is quite a bit different than your original post. 😉 No biggie. You can do this with a cross tab. Take a look at the links in my signature about how to do this. You won't need to use the dynamic version because you have a max number of columns thanks to the denormalized table on the other end.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Something like this should be pretty close. Since you didn't post ddl or data this is untested but it should work.

    declare @EmployeeID int = 10001;

    with SortedVals as

    (

    select CountryId

    , ROW_NUMBER () over (order by CountryID) as RowNum

    FROM EmployeeCountry

    where EmployeeId = @EmployeeID

    )

    Select @EmployeeID

    , MAX(case when RowNum = 1 then CountryID end) as CountryID1

    , MAX(case when RowNum = 2 then CountryID end) as CountryID2

    , MAX(case when RowNum = 3 then CountryID end) as CountryID3

    from SortedVals

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (4/14/2015)


    liamdemasi (4/14/2015)


    I am not splitting it back into 2 tables. I am trying to get the CountryIds associated with an EmployeeId in the EmployeeCountry table, then I am going to insert it into an Employee table that has the columns EmployeeId, Country1Id, Country2Id, Country3Id.

    So the basic query would be something like:

    SELECT EmployeeId, CountryId from EmployeeCountry where EmployeeId = 10001

    So then I will have 3 rows of data with EmployeeId, Country Id in each row:

    EmployeeID CountryId

    10001 20001

    10001 20002

    10001 20003

    Then I want to insert the CountryIds into the Employee table such as:

    EmployeeId Country1Id Country2Id Country3Id

    10001 20001 20002 20003

    Ugh. Any chance you can normalize the target? If not, you need to make sure you limit the number of countryIds so the new table has a place for all of them.

    "There will only be a max of 3 Country entries for each Employee."

    I've found this particular denormalization to be a popular request in my experience. Since the number of countries will be limitted to those three columns, I'd probably just use three updates, assuming we can take advantage of the sortability of the countryid.

    I'm also just as sure that there are more clever solutions out there 😉

    but anyways, this is my guess!

    create table EmployeeCountry

    (

    EmployeeID int,

    CountryID int

    )

    create table Employee

    (

    EmployeeID int,

    Country1ID int,

    Country2ID int,

    Country3ID int

    )

    insert into EmployeeCountry select 10001,20001

    insert into EmployeeCountry select 10001,20002

    insert into EmployeeCountry select 10001,20003

    insert into EmployeeCountry select 10002,20001

    insert into EmployeeCountry select 10002,20004

    insert into EmployeeCountry select 10002,20005

    insert into EmployeeCountry select 10003,20005

    insert into EmployeeCountry select 10003,20006

    insert into Employee select 10001, null, null, null

    insert into Employee select 10002, null, null, null

    insert into Employee select 10003, null, null, null

    UPDATE Employee SET Country1Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID)

    UPDATE Employee SET Country2Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID

    AND CountryID <> Country1Id)

    UPDATE Employee SET Country3Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID

    AND CountryID <> Country1Id AND CountryID <> Country2Id)

    select * from Employee

    select * from EmployeeCountry

  • patrickmcginnis59 10839 (4/14/2015)


    Sean Lange (4/14/2015)


    liamdemasi (4/14/2015)


    I am not splitting it back into 2 tables. I am trying to get the CountryIds associated with an EmployeeId in the EmployeeCountry table, then I am going to insert it into an Employee table that has the columns EmployeeId, Country1Id, Country2Id, Country3Id.

    So the basic query would be something like:

    SELECT EmployeeId, CountryId from EmployeeCountry where EmployeeId = 10001

    So then I will have 3 rows of data with EmployeeId, Country Id in each row:

    EmployeeID CountryId

    10001 20001

    10001 20002

    10001 20003

    Then I want to insert the CountryIds into the Employee table such as:

    EmployeeId Country1Id Country2Id Country3Id

    10001 20001 20002 20003

    Ugh. Any chance you can normalize the target? If not, you need to make sure you limit the number of countryIds so the new table has a place for all of them.

    "There will only be a max of 3 Country entries for each Employee."

    I've found this particular denormalization to be a popular request in my experience. Since the number of countries will be limitted to those three columns, I'd probably just use three updates, assuming we can take advantage of the sortability of the countryid.

    I'm also just as sure that there are more clever solutions out there 😉

    but anyways, this is my guess!

    create table EmployeeCountry

    (

    EmployeeID int,

    CountryID int

    )

    create table Employee

    (

    EmployeeID int,

    Country1ID int,

    Country2ID int,

    Country3ID int

    )

    insert into EmployeeCountry select 10001,20001

    insert into EmployeeCountry select 10001,20002

    insert into EmployeeCountry select 10001,20003

    insert into EmployeeCountry select 10002,20001

    insert into EmployeeCountry select 10002,20004

    insert into EmployeeCountry select 10002,20005

    insert into EmployeeCountry select 10003,20005

    insert into EmployeeCountry select 10003,20006

    insert into Employee select 10001, null, null, null

    insert into Employee select 10002, null, null, null

    insert into Employee select 10003, null, null, null

    UPDATE Employee SET Country1Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID)

    UPDATE Employee SET Country2Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID

    AND CountryID <> Country1Id)

    UPDATE Employee SET Country3Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID

    AND CountryID <> Country1Id AND CountryID <> Country2Id)

    select * from Employee

    select * from EmployeeCountry

    For a more "clever" solution you could look at the cross tab I posted. It would be a single query instead of 4 for each row in the target.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (4/14/2015)


    patrickmcginnis59 10839 (4/14/2015)


    Sean Lange (4/14/2015)


    liamdemasi (4/14/2015)


    I am not splitting it back into 2 tables. I am trying to get the CountryIds associated with an EmployeeId in the EmployeeCountry table, then I am going to insert it into an Employee table that has the columns EmployeeId, Country1Id, Country2Id, Country3Id.

    So the basic query would be something like:

    SELECT EmployeeId, CountryId from EmployeeCountry where EmployeeId = 10001

    So then I will have 3 rows of data with EmployeeId, Country Id in each row:

    EmployeeID CountryId

    10001 20001

    10001 20002

    10001 20003

    Then I want to insert the CountryIds into the Employee table such as:

    EmployeeId Country1Id Country2Id Country3Id

    10001 20001 20002 20003

    Ugh. Any chance you can normalize the target? If not, you need to make sure you limit the number of countryIds so the new table has a place for all of them.

    "There will only be a max of 3 Country entries for each Employee."

    I've found this particular denormalization to be a popular request in my experience. Since the number of countries will be limitted to those three columns, I'd probably just use three updates, assuming we can take advantage of the sortability of the countryid.

    I'm also just as sure that there are more clever solutions out there 😉

    but anyways, this is my guess!

    create table EmployeeCountry

    (

    EmployeeID int,

    CountryID int

    )

    create table Employee

    (

    EmployeeID int,

    Country1ID int,

    Country2ID int,

    Country3ID int

    )

    insert into EmployeeCountry select 10001,20001

    insert into EmployeeCountry select 10001,20002

    insert into EmployeeCountry select 10001,20003

    insert into EmployeeCountry select 10002,20001

    insert into EmployeeCountry select 10002,20004

    insert into EmployeeCountry select 10002,20005

    insert into EmployeeCountry select 10003,20005

    insert into EmployeeCountry select 10003,20006

    insert into Employee select 10001, null, null, null

    insert into Employee select 10002, null, null, null

    insert into Employee select 10003, null, null, null

    UPDATE Employee SET Country1Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID)

    UPDATE Employee SET Country2Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID

    AND CountryID <> Country1Id)

    UPDATE Employee SET Country3Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID

    AND CountryID <> Country1Id AND CountryID <> Country2Id)

    select * from Employee

    select * from EmployeeCountry

    For a more "clever" solution you could look at the cross tab I posted. It would be a single query instead of 4 for each row in the target.

    The reason I'm partial to mine, is that its three updates for all the employees instead of row by row.

    edit: its actually one update per column. I've used something similar and if you have a giant stack of employees, it seems to be pretty efficient. Now the downside is that the subqueries might impact things, but I remember that for several thousand (scores, in my case), it was very nice for my use case.

  • patrickmcginnis59 10839 (4/14/2015)


    Sean Lange (4/14/2015)


    patrickmcginnis59 10839 (4/14/2015)


    Sean Lange (4/14/2015)


    liamdemasi (4/14/2015)


    I am not splitting it back into 2 tables. I am trying to get the CountryIds associated with an EmployeeId in the EmployeeCountry table, then I am going to insert it into an Employee table that has the columns EmployeeId, Country1Id, Country2Id, Country3Id.

    So the basic query would be something like:

    SELECT EmployeeId, CountryId from EmployeeCountry where EmployeeId = 10001

    So then I will have 3 rows of data with EmployeeId, Country Id in each row:

    EmployeeID CountryId

    10001 20001

    10001 20002

    10001 20003

    Then I want to insert the CountryIds into the Employee table such as:

    EmployeeId Country1Id Country2Id Country3Id

    10001 20001 20002 20003

    Ugh. Any chance you can normalize the target? If not, you need to make sure you limit the number of countryIds so the new table has a place for all of them.

    "There will only be a max of 3 Country entries for each Employee."

    I've found this particular denormalization to be a popular request in my experience. Since the number of countries will be limitted to those three columns, I'd probably just use three updates, assuming we can take advantage of the sortability of the countryid.

    I'm also just as sure that there are more clever solutions out there 😉

    but anyways, this is my guess!

    create table EmployeeCountry

    (

    EmployeeID int,

    CountryID int

    )

    create table Employee

    (

    EmployeeID int,

    Country1ID int,

    Country2ID int,

    Country3ID int

    )

    insert into EmployeeCountry select 10001,20001

    insert into EmployeeCountry select 10001,20002

    insert into EmployeeCountry select 10001,20003

    insert into EmployeeCountry select 10002,20001

    insert into EmployeeCountry select 10002,20004

    insert into EmployeeCountry select 10002,20005

    insert into EmployeeCountry select 10003,20005

    insert into EmployeeCountry select 10003,20006

    insert into Employee select 10001, null, null, null

    insert into Employee select 10002, null, null, null

    insert into Employee select 10003, null, null, null

    UPDATE Employee SET Country1Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID)

    UPDATE Employee SET Country2Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID

    AND CountryID <> Country1Id)

    UPDATE Employee SET Country3Id = (SELECT MIN (CountryId) FROM EmployeeCountry EC WHERE EC.EmployeeID = Employee.EmployeeID

    AND CountryID <> Country1Id AND CountryID <> Country2Id)

    select * from Employee

    select * from EmployeeCountry

    For a more "clever" solution you could look at the cross tab I posted. It would be a single query instead of 4 for each row in the target.

    The reason I'm partial to mine, is that its three updates for all the employees instead of row by row.

    edit: its actually one update per column. I've used something similar and if you have a giant stack of employees, it seems to be pretty efficient. Now the downside is that the subqueries might impact things, but I remember that for several thousand (scores, in my case), it was very nice for my use case.

    OK. The point is that you have to do an insert (pulling from the base table) and then 3 updates (also pulling data from the base table). This makes 4 queries for every row in the target. The way I used is a single query for a row in the target table. It is about set based logic instead of RBAR (or in this case CBAC). Sure it will work but it will take 4 times longer because there are 4 times as many queries.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply