SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


How to store a central repository of data


How to store a central repository of data

Author
Message
sbaer
sbaer
Forum Newbie
Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)

Group: General Forum Members
Points: 4 Visits: 30
I am fairly new to database design and haven't been able to find a lot on this question.

I work for a marketing and advertising firm. We don't have a "transaction" database since all of our data comes from our clients. I'm responsible to load it all into a database so we can remove duplicates, clean the data, and run analytics.

We normally create a separate database for each client for security reasons (decision made by IT security). But most of these databases has some things in common, mostly address data.

I want a central repository of things like zip code data (zip codes linked to county, regions, etc.), country codes, state codes, etc. Usually, this repository would be used just for the initial insert and cleanup. I'd have to enter it into the client's database the way the client wants to see it.

Any advice, thoughts, or references on how this is best implemented? Are there problems associated with linking data like this between databases? Is there an easier way to keep a single zip code table but have it be accessible to several address tables?
Eirikur Eiriksson
Eirikur Eiriksson
SSC-Dedicated
SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)

Group: General Forum Members
Points: 39097 Visits: 19432
sbaer (5/6/2014)
I am fairly new to database design and haven't been able to find a lot on this question.

I work for a marketing and advertising firm. We don't have a "transaction" database since all of our data comes from our clients. I'm responsible to load it all into a database so we can remove duplicates, clean the data, and run analytics.

We normally create a separate database for each client for security reasons (decision made by IT security). But most of these databases has some things in common, mostly address data.

Have a talk with the security guys and find out exactly what the requirements are, what you must comply to etc. A mutual understanding on this is imperative!


I want a central repository of things like zip code data (zip codes linked to county, regions, etc.), country codes, state codes, etc. Usually, this repository would be used just for the initial insert and cleanup. I'd have to enter it into the client's database the way the client wants to see it.


If possible, establish a single data collection point. From there it is easy to distribute the client's data to the appropriate destinations. In the central repository, maintain all the the referential data that is NOT client specific.

Any advice, thoughts, or references on how this is best implemented? Are there problems associated with linking data like this between databases? Is there an easier way to keep a single zip code table but have it be accessible to several address tables?


All reporting repositories should be client specific!

Hopes this helps, feel free to inquire further.
Cool
sbaer
sbaer
Forum Newbie
Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)

Group: General Forum Members
Points: 4 Visits: 30
If possible, establish a single data collection point. From there it is easy to distribute the client's data to the appropriate destinations. In the central repository, maintain all the the referential data that is NOT client specific.


Would this be a correct understanding of the above:

All of my "central repository" is stored in a single database called "Staging". As data is received, it flows through tables in Staging before getting to the final destination of the customer specific database. While in Staging, I can link to any of the central repository necessary, transforming data fields as necessary for the specifics of each customer. Once all of the cleanup and look up is complete, the data can be transferred to the customer specific database with the values required by the customer.

It would mean Staging would be a very large database with several schemas, stored procedures, and possibly hundreds of tables. But the data should reside there only temporarily as it is transformed so the physical size of the database would never be very large. I think the security team would buy off on this considering the data is removed to a more "secure" location once it is cleaned.

If I've missed anything, let me know. Otherwise, this seems like a logical and simple solution.
Eirikur Eiriksson
Eirikur Eiriksson
SSC-Dedicated
SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)SSC-Dedicated (39K reputation)

Group: General Forum Members
Points: 39097 Visits: 19432
sbaer (5/6/2014)
If possible, establish a single data collection point. From there it is easy to distribute the client's data to the appropriate destinations. In the central repository, maintain all the the referential data that is NOT client specific.


Would this be a correct understanding of the above:

All of my "central repository" is stored in a single database called "Staging". As data is received, it flows through tables in Staging before getting to the final destination of the customer specific database. While in Staging, I can link to any of the central repository necessary, transforming data fields as necessary for the specifics of each customer. Once all of the cleanup and look up is complete, the data can be transferred to the customer specific database with the values required by the customer.

It would mean Staging would be a very large database with several schemas, stored procedures, and possibly hundreds of tables. But the data should reside there only temporarily as it is transformed so the physical size of the database would never be very large. I think the security team would buy off on this considering the data is removed to a more "secure" location once it is cleaned.

If I've missed anything, let me know. Otherwise, this seems like a logical and simple solution.


The staging would not be too large, the data would be removed from one import to another. It will need an amble breathing space but there should be no overall growth.

Several hundreds or even thousands of tables and procedures are no problem, establish a good logical separation via client_schema_object naming convention.

Otherwise, I think you got it!
Cool
tomislav-b
tomislav-b
Forum Newbie
Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)Forum Newbie (1 reputation)

Group: General Forum Members
Points: 1 Visits: 0
sbaer (5/6/2014)
I am fairly new to database design and haven't been able to find a lot on this question.

I work for a marketing and advertising firm. We don't have a "transaction" database since all of our data comes from our clients. I'm responsible to load it all into a database so we can remove duplicates, clean the data, and run analytics.

We normally create a separate database for each client for security reasons (decision made by IT security). But most of these databases has some things in common, mostly address data.

I want a central repository of things like zip code data (zip codes linked to county, regions, etc.), country codes, state codes, etc. Usually, this repository would be used just for the initial insert and cleanup. I'd have to enter it into the client's database the way the client wants to see it.

Any advice, thoughts, or references on how this is best implemented? Are there problems associated with linking data like this between databases? Is there an easier way to keep a single zip code table but have it be accessible to several address tables?


I had the same problem and after long searching I found this software.
It saved me a lot of time...
sbaer
sbaer
Forum Newbie
Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)Forum Newbie (4 reputation)

Group: General Forum Members
Points: 4 Visits: 30
Thanks for the reply. I actually have a purchased zip code database. Just having the data doesn't mean you have it in the best place. What I needed was to establish a process where that data could be used by several databases, run against several different kinds of data coming in, and have it standardized so I didn't have to load it into 10 different places just to have access to it.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search