Log in
::
Register
::
Not logged in
Home
Tags
Articles
Editorials
Stairways
Forums
Scripts
Videos
Blogs
QotD
Books
Ask SSC
SQL Jobs
Training
Authors
About us
Contact us
Newsletters
Write for us
Recent Posts
Recent Posts
Popular Topics
Popular Topics
Home
Search
Members
Calendar
Who's On
Home
»
Database Design
»
Design Ideas and Questions
»
Separate Staging Tables and Data Warehouse...
35 posts, Page 1 of 4
1
2
3
4
»
»»
Separate Staging Tables and Data Warehouse into different Databases.
Rate Topic
Display Mode
Topic Options
Author
Message
Welsh Corgi
Welsh Corgi
Posted Friday, March 23, 2012 11:10 AM
Hall of Fame
Group: General Forum Members
Last Login: Today @ 11:42 AM
Points: 3,827,
Visits: 4,047
Until today I never saw a Database that contained both the Staging Tables and Data Warehouse Tables in the same Database.
The Staging Tables are in a different schema than the Data Warehouse Tables.
Maybe I'm missing something but I have to make a solid case that this is a good practice.
Any information would be greatly appreciated.
http://www.bidn.com/blogs/TomLannen/bidn-blog/1277/separating-data-warehouse-tables-from-your-relational-and-staging-tables
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1271897
Lynn Pettis
Lynn Pettis
Posted Friday, March 23, 2012 11:31 AM
SSC-Insane
Group: General Forum Members
Last Login: Today @ 2:17 PM
Points: 21,594,
Visits: 27,406
Can't really say if it is good or bad, just different. When we started a data warehouse project at a previous employer we set up the Staging and ODS databases separately. We could have just as easily put them in the same the database using schemas to keep things separate.
If this is being done, one thing I would also make sure of is that the staging and data warehouse tables were in different file groups. This would allow for the placing of these groups on different spindles on the server (or SAN) so that they weren't using the same disks.
Lynn Pettis
For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here
or
when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here
and
here
Managing Transaction Logs
SQL Musings from the Desert
Fountain Valley SQL
(My Mirror Blog)
Post #1271919
opc.three
opc.three
Posted Friday, March 23, 2012 11:53 AM
SSCertifiable
Group: General Forum Members
Last Login: Today @ 1:28 PM
Points: 6,698,
Visits: 11,726
What I often see are separate databases on the same instance, where the staging database is in simple recovery mode and the data warehouse is in full recovery mode. Procs parked in the staging database use cross-database calls to load the data into the data warehouse database after it has been massaged and prepared in tables in the staging database. Different schemas in the same database do not give you the separation necessary to run different recovery modes.
The big barrier to moving the staging database onto a different instance altogether is the additional overhead of getting that data from the staging database into the data warehouse. Consider a simple cross-database INSERT...SELECT versus a secondary push job using SSIS or some other means. It's a tradeoff when having your staging database on a separate instance between flexibility in terms of scaling your environment and ease of loading your data warehouse.
__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
Believe you can and you're halfway there.
--Theodore Roosevelt
Everything Should Be Made as Simple as Possible, But Not Simpler
--Albert Einstein
The significant problems we face cannot be solved at the same level of thinking we were at when we created them.
--Albert Einstein
1 apple is not exactly 1/8 of 8 apples. Because there are no absolutely identical apples.
--Giordy
Post #1271942
Welsh Corgi
Welsh Corgi
Posted Friday, March 23, 2012 11:56 AM
Hall of Fame
Group: General Forum Members
Last Login: Today @ 11:42 AM
Points: 3,827,
Visits: 4,047
Thanks you very much for the information!
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1271948
Lynn Pettis
Lynn Pettis
Posted Friday, March 23, 2012 11:59 AM
SSC-Insane
Group: General Forum Members
Last Login: Today @ 2:17 PM
Points: 21,594,
Visits: 27,406
opc.three (3/23/2012)
What I often see are separate databases on the same instance, where the staging database is in simple recovery mode and the data warehouse is in full recovery mode. Procs parked in the staging database use cross-database calls to load the data into the data warehouse database after it has been massaged and prepared in tables in the staging database. Different schemas in the same database do not give you the separation necessary to run different recovery modes.
The big barrier to moving the staging database onto a different instance altogether is the additional overhead of getting that data from the staging database into the data warehouse. Consider a simple cross-database INSERT...SELECT versus a secondary push job using SSIS or some other means. It's a tradeoff when having your staging database on a separate instance between flexibility in terms of scaling your environment and ease of loading your data warehouse.
I understand the Staging database using the simple recovery model, but curious why the data warehouse is using the full recovery model? Is it due to size and amount of data loaded each time that t-log backups are considered better?
Previous company I worked at that was building a dat warehouse had theirs using the simple recovery model. A full backup was run after the nightly load process was completed.
Lynn Pettis
For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here
or
when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here
and
here
Managing Transaction Logs
SQL Musings from the Desert
Fountain Valley SQL
(My Mirror Blog)
Post #1271951
opc.three
opc.three
Posted Friday, March 23, 2012 12:14 PM
SSCertifiable
Group: General Forum Members
Last Login: Today @ 1:28 PM
Points: 6,698,
Visits: 11,726
Lynn Pettis (3/23/2012)
opc.three (3/23/2012)
What I often see are separate databases on the same instance, where the staging database is in simple recovery mode and the data warehouse is in full recovery mode. Procs parked in the staging database use cross-database calls to load the data into the data warehouse database after it has been massaged and prepared in tables in the staging database. Different schemas in the same database do not give you the separation necessary to run different recovery modes.
The big barrier to moving the staging database onto a different instance altogether is the additional overhead of getting that data from the staging database into the data warehouse. Consider a simple cross-database INSERT...SELECT versus a secondary push job using SSIS or some other means. It's a tradeoff when having your staging database on a separate instance between flexibility in terms of scaling your environment and ease of loading your data warehouse.
I understand the Staging database using the simple recovery model, but curious why the data warehouse is using the full recovery model? Is it due to size and amount of data loaded each time that t-log backups are considered better?
Previous company I worked at that was building a dat warehouse had theirs using the simple recovery model. A full backup was run after the nightly load process was completed.
Generically, a staging database where tables are wiped and reloaded with each new incoming feed makes it volatile, i.e. there is little to no value in recovering a previous DB. Whereas to recover a DW to minimize downtime for data clients could be important.
In the first case that comes to mind running the DW in FULL was purely a DR position. Standard operating procedure was for all mission-critical databases, for which the DW database qualified, to be mirrored. As you know mirroring requires the database to be in FULL recovery mode 100% of the time. Staging was classified as volatile so the DR plan for that DB was to restore the last full backup to the secondary server and resume the job schedule. With mirroring, the DW could be brought online by simply failing over to the secondary mirror. Overkill? Maybe, but the shop had a very "one size fits all" mentality.
I could see value in log shipping a DW though, or in being able to restore to a PIT for purposes of isolating a production issue. Are those worth the overhead of running a DW in FULL? Not sure, depends on the shop and their SLAs I guess.
__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
Believe you can and you're halfway there.
--Theodore Roosevelt
Everything Should Be Made as Simple as Possible, But Not Simpler
--Albert Einstein
The significant problems we face cannot be solved at the same level of thinking we were at when we created them.
--Albert Einstein
1 apple is not exactly 1/8 of 8 apples. Because there are no absolutely identical apples.
--Giordy
Post #1271962
Lynn Pettis
Lynn Pettis
Posted Friday, March 23, 2012 12:18 PM
SSC-Insane
Group: General Forum Members
Last Login: Today @ 2:17 PM
Points: 21,594,
Visits: 27,406
opc.three (3/23/2012)
Lynn Pettis (3/23/2012)
opc.three (3/23/2012)
What I often see are separate databases on the same instance, where the staging database is in simple recovery mode and the data warehouse is in full recovery mode. Procs parked in the staging database use cross-database calls to load the data into the data warehouse database after it has been massaged and prepared in tables in the staging database. Different schemas in the same database do not give you the separation necessary to run different recovery modes.
The big barrier to moving the staging database onto a different instance altogether is the additional overhead of getting that data from the staging database into the data warehouse. Consider a simple cross-database INSERT...SELECT versus a secondary push job using SSIS or some other means. It's a tradeoff when having your staging database on a separate instance between flexibility in terms of scaling your environment and ease of loading your data warehouse.
I understand the Staging database using the simple recovery model, but curious why the data warehouse is using the full recovery model? Is it due to size and amount of data loaded each time that t-log backups are considered better?
Previous company I worked at that was building a dat warehouse had theirs using the simple recovery model. A full backup was run after the nightly load process was completed.
Generically, a staging database where tables are wiped and reloaded with each new incoming feed makes it volatile, i.e. there is little to no value in recovering a previous DB. Whereas to recover a DW to minimize downtime for data clients could be important.
In the first case that comes to mind running the DW in FULL was purely a DR position. Standard operating procedure was for all mission-critical databases, for which the DW database qualified, to be mirrored. As you know mirroring requires the database to be in FULL recovery mode 100% of the time. Staging was classified as volatile so the DR plan for that DB was to restore the last full backup to the secondary server and resume the job schedule. With mirroring, the DW could be brought online by simply failing over to the secondary mirror. Overkill? Maybe, but the shop had a very "one size fits all" mentality.
I could see value in log shipping a DW though, or in being able to restore to a PIT for purposes of isolating a production issue. Are those worth the overhead of running a DW in FULL? Not sure, depends on the shop and their SLAs I guess.
As soon as I saw DB Mirroring, it makes sense. With that I would also assume that t-log backups are run periodically or after each load to keep the t-log from filling the disk.
Lynn Pettis
For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here
or
when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here
and
here
Managing Transaction Logs
SQL Musings from the Desert
Fountain Valley SQL
(My Mirror Blog)
Post #1271964
opc.three
opc.three
Posted Friday, March 23, 2012 12:25 PM
SSCertifiable
Group: General Forum Members
Last Login: Today @ 1:28 PM
Points: 6,698,
Visits: 11,726
Lynn Pettis (3/23/2012)
As soon as I saw DB Mirroring, it makes sense. With that I would also assume that t-log backups are run periodically or after each load to keep the t-log from filling the disk.
I cannot recall specifically, but it was something like every 15 minutes during heavy load times, and every few hours during the times when the DB was being used for reporting. Some lighter data feeds were loaded throughout the day as well.
__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
Believe you can and you're halfway there.
--Theodore Roosevelt
Everything Should Be Made as Simple as Possible, But Not Simpler
--Albert Einstein
The significant problems we face cannot be solved at the same level of thinking we were at when we created them.
--Albert Einstein
1 apple is not exactly 1/8 of 8 apples. Because there are no absolutely identical apples.
--Giordy
Post #1271970
PaulB-TheOneAndOnly
PaulB-TheOneAndOnly
Posted Friday, March 23, 2012 1:24 PM
SSCrazy
Group: General Forum Members
Last Login: Today @ 1:08 PM
Points: 2,982,
Visits: 4,396
Welsh Corgi (3/23/2012)
Until today I never saw a Database that contained both the Staging Tables and Data Warehouse Tables in the same Database.
This is common practice in "some" RDBMS other than SQL Server, in most cases even in the same schema relying on naming covention to tell appart each kind of table like; stg_, dim_, fact_, ods_, etc.
_____________________________________
Pablo (Paul) Berzukov
Author of
Understanding Database Administration
available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
Post #1272045
Welsh Corgi
Welsh Corgi
Posted Friday, March 23, 2012 2:04 PM
Hall of Fame
Group: General Forum Members
Last Login: Today @ 11:42 AM
Points: 3,827,
Visits: 4,047
PaulB-TheOneAndOnly (3/23/2012)
Welsh Corgi (3/23/2012)
Until today I never saw a Database that contained both the Staging Tables and Data Warehouse Tables in the same Database.
This is common practice in "some" RDBMS other than SQL Server, in most cases even in the same schema relying on naming covention to tell appart each kind of table like; stg_, dim_, fact_, ods_, etc.
I'm only concerned about SQL Server at this point but I remeber that in Oracle when you create a new Database you specify whether you want to create an OLTP or an OLAP Database.
Oracle configures certain system options differently depending upon whether it is an OLTP or an OLAP Database.
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/
Post #1272076
« Prev Topic
|
Next Topic »
35 posts, Page 1 of 4
1
2
3
4
»
»»
Permissions
You
cannot
post new topics.
You
cannot
post topic replies.
You
cannot
post new polls.
You
cannot
post replies to polls.
You
cannot
edit your own topics.
You
cannot
delete your own topics.
You
cannot
edit other topics.
You
cannot
delete other topics.
You
cannot
edit your own posts.
You
cannot
edit other posts.
You
cannot
delete your own posts.
You
cannot
delete other posts.
You
cannot
post events.
You
cannot
edit your own events.
You
cannot
edit other events.
You
cannot
delete your own events.
You
cannot
delete other events.
You
cannot
send private messages.
You
cannot
send emails.
You
may
read topics.
You
cannot
rate topics.
You
cannot
vote within polls.
You
cannot
upload attachments.
You
may
download attachments.
You
cannot
post HTML code.
You
cannot
edit HTML code.
You
cannot
post IFCode.
You
cannot
post JavaScript.
You
cannot
post EmotIcons.
You
cannot
post or upload images.
Copyright © 2002-2013 Simple Talk Publishing. All Rights Reserved.
Privacy Policy.
Terms of Use.
Report Abuse.