February 14, 2012 at 11:50 am
I am hoping to use SSIS to export data from multiple spreadsheets to multiple MS SQL tables that have foreign keys on them.
This package would need to be reuable. The one requirement that we have is that the spreadsheet to SQL tables must be a 1 to 1 relationship.
I have two issues I'm running into.
1. How should the data in the spreadsheet be formatted to allow for the foreign key?
2. I've not got alot of experience in SSIS - do you think this can be done efficiently using SSIS?
Here is a simple classic example of what I want to accomplish.
I would like to convert city, state and postal codes from spreadsheets
In the SQL database we have a separate table for each of those areas and they have foreign key contraints on each other.
This is my working copy of how I think the spreadsheets kind of need to look...but I am not sure of the best way to handle the foreign key issues. I realize that what I have below won't work as is. But I want to keep this as easy as possible for the user to setup the spreadsheets. Any suggestions you have would be greatly appreciated.
State Spreadsheet
State State AbbrevCountry
South DakotaSD USA
North DakotaND USA
MinnesotaMN USA
Iowa IA USA
City Spreadsheet
Key to StateCity
SD Mitchell(need to tie to SD)
ND Fargo
MN Saint Paul
IA Des Moine
MN Mitchell (need to tie to MN)
Postal Spreadsheet
Key to CityPostal Code
SD/Mitchell57301
ND/Fargo 43809
MN/Saint Paul32879
IA/Des Moine85644
MN/Mitchell15488
February 14, 2012 at 11:59 am
I'm not used to city, state, Zip tables being something that has to be imported repeatedly. That data doesn't change frequently.
But if I had to, I'd set up the imports to be sequential. That handles your FKs for you. Import states first, then cities (since FKs to states will be in the table already), and so on.
On the structure of the Excel files, it should parallel your tables as closely as possible. However, I've never yet run into a situation where an Excel data source was properly normalized, like what you're talking about. I'm accustomed to importing raw to a staging table, then distributing to normalized tables from there. Not the only way to do it, but it's worked well for me.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
February 14, 2012 at 12:13 pm
I am just using city, state and postal code example because it is something that everyone can relate to. Let me simplify it a bit further. The requirement is one spreadsheet per table.
So, if I have this
States
SD
MN
IA
Cities
Mitchell (is in the state of SD)
St. Paul
Mitchell (is in the state of IA)
What is the best way for the user to indicate in the spreadsheets that the first Mitchell belongs to SD and the second Mitchell belongs to IA? Know that these relationships get more difficult with other areas of data.
Then we have
Postal Codes
57111 - goes with Mitchell, IA
95632 - goes with Mitchell, SD
87542 - goes with Saint Paul
HOw would you accomplish this with temp tables? it seems to me that somehow the user has to tell us what city, state, postal code combinations are valid.
Im open to any ideas at all that will solve my problem.
February 14, 2012 at 12:19 pm
The spreadsheet will need to contain the parent object for each row. Cities will have to have states with them, and so on. No way around that.
Of course, with this example, one spreadsheet would do the whole job. But I'm assuming the real data is more complex.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
February 14, 2012 at 1:32 pm
Yes, any time a piece of data could be the child of more than one parent, you have to have an indicator of which parent it goes to.
Ideally, getting just zipcode would be all you need.
Populating from teh lowest level on up is the way to go.
February 14, 2012 at 1:51 pm
herladygeekedness (2/14/2012)
Yes, any time a piece of data could be the child of more than one parent, you have to have an indicator of which parent it goes to.Ideally, getting just zipcode would be all you need.
Populating from teh lowest level on up is the way to go.
By "lowest", do you mean Zip, or state?
I ask because I'd normally think of state as highest and Zip as lowest, as per a usual hierarchy tree. And you'd need to have states before Zips, in order to have FKs from top to bottom.
If you mean state as lowest, then yes, that's the way to go.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
February 14, 2012 at 2:16 pm
I mean zip as lowest. The smallest granularity. The largest population.
States have many cities, cities can have one or many zip codes but a zip code should have a unique city/state combo assigned to it. There is nothing lower than a zip code except the "plus 4" which I left out of the conversation.
So I would pop zips first because they are the grandchildren in this case and I generally go from teh Many- side first.
February 14, 2012 at 3:02 pm
Thanks so much to both of you for your assistance. I very much appreciate it.
February 15, 2012 at 8:09 am
herladygeekedness (2/14/2012)
I mean zip as lowest. The smallest granularity. The largest population.States have many cities, cities can have one or many zip codes but a zip code should have a unique city/state combo assigned to it. There is nothing lower than a zip code except the "plus 4" which I left out of the conversation.
So I would pop zips first because they are the grandchildren in this case and I generally go from teh Many- side first.
Except you can't do that if you need data in a parent table for a foreign key to reference. You can't insert a Zip in New York city if you haven't already put New York in the cities table, and you can't add the city if you don't have the state in that table. Not without violating referential integrity.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
February 15, 2012 at 9:56 am
ooop, sorry. we don't do incremental, we wipe daily.
February 15, 2012 at 10:31 pm
All of that wonderment goes to hell in a handbasket if someone has the spreadsheet open at run time. Wouldn't it be better and simpler to provide the user with a spreadsheet that had a macro button on it that said "When you're done, click this button to export".
--Jeff Moden
Change is inevitable... Change for the better is not.
February 16, 2012 at 6:52 am
Jeff Moden (2/15/2012)
All of that wonderment goes to hell in a handbasket if someone has the spreadsheet open at run time. Wouldn't it be better and simpler to provide the user with a spreadsheet that had a macro button on it that said "When you're done, click this button to export".
Yep.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Viewing 12 posts - 1 through 12 (of 12 total)
You must be logged in to reply to this topic. Login to reply