Data Warehouse Disaster Revcovery Options

  • I'm in the process of implementing a Data Warehouse.

    I have a Staging Database and a DatawArehouse Database.

    I have data feeds from DB2, Oracle and SQL Server.

    The DB2 feed is very large, from a remote location and the bandwidth is slow.

    I have the Staging Database set to Simple Recovery primarily because I want to minimize the load time to fit within a narrow window.

    I currently have the Data Warehouse Database set to simple recovery model.

    It was suggested that I use log shipping as a Disaster Recovery methodology but for obvious reasons that is not going to work.

    I was considering Mirroring and Snapshot Replication but it is my understanding that Replication was not intended to be a DR Solution?

    Any thoughts on this?

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • You say Log Shipping is not going to work for obvious reasons (i.e. because you are in Simple). But I don't understandi your reasons for being in Simple in the first place. You said it was to "reduce load time". Please explain.

  • It takes a long time to load the Data and I can barely complete the load within a narrow window.

    I save time by not logging the Inserts, etc.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Inserts, etc are still logged in Simple mode. They're just removed more quickly. Have you tried switching to Full recovery mode and comparing the time taken?

    John

  • That right, simple recovery truncates the transaction log on the checkpoint.

    Thanks, I will switch recovery mode and perform comparisons.

    So if I'm using full recovery model is log shipping the preferred DR Strategy?

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • We need a bit more information, please. Which database(s) will your DR strategy cover? I assume your Staging database is updated from the external sources and then your Data Warehouse database is updated from the Staging database? How often does that happen?

    I have found that Database Mirroring is the simplest, cleanest solution, but it won't be suitable for all purposes. If you just have one overnight update to your databases, for example, then you might consider a backup and restore instead, and build that into your batch process.

    As far as putting yoour databases into full recovery mode is concerned, make sure your log files are large enough for the frequency of your log backups. If your log files need to grow during the load then any comparison with Simple mode may not be very helpful.

    John

  • John Mitchell-245523 (11/12/2012)


    We need a bit more information, please. Which database(s) will your DR strategy cover? I assume your Staging database is updated from the external sources and then your Data Warehouse database is updated from the Staging database? How often does that happen?

    I have found that Database Mirroring is the simplest, cleanest solution, but it won't be suitable for all purposes. If you just have one overnight update to your databases, for example, then you might consider a backup and restore instead, and build that into your batch process.

    As far as putting yoour databases into full recovery mode is concerned, make sure your log files are large enough for the frequency of your log backups. If your log files need to grow during the load then any comparison with Simple mode may not be very helpful.

    John

    I have Data Feeds from IBM DB2, Oracle, and SQL Server and for each data source I have a number of loads to staging.

    The Data Warehouse is finally loaded from Staging.

    I have written DTS & SSIS Packages to backup the Database, Compress it, Copy it to another Server, unzip it and finally restore the Databases.

    How about snapshot replication?

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • If you want to copy the whole database, backup and restore is much cleaner. I'd only recommend snapshot replication if there are only certain tables that you want to copy across to DR. As for your staging database, it sounds as if it's getting updated several times a day, so you'll probably want to look at log shipping or mirroring - or transactional replication if you don't want to copy everything. Given that staging databases are often, by their very nature, entirely transient, it may be sufficient just to keep an empty copy of it on your DR server.

    John

  • Isn't log shipping and mirroring high availability? I thought having plenty of verified backups is disaster recovery.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • It's probably just a question of semantics. As I understand it, a high availability server can form part of a DR strategy. I don't think it matters whether the server is referred to as DR or HA as long as the process is properly documented and tested.

    John

  • You skipped Step 1. Please define your RTO and RPO, then you can start to evaluate your options. If your RPO is 24 hours then Simple Recovery Model with nightly FULL backups could be enough. If your RTO is 30 minutes then FULL recovery model with Mirroring or Log Shipping will probably make more sense.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • RTO, RPO?:unsure:

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • RTO > Recovery Time Objective: the duration of time and a service level within which a business process must be restored after a disaster (or disruption) in order to avoid unacceptable consequences associated with a break in business continuity, i.e. how long can the system be unavailable while databases are being recovered.

    RPO > Recovery Point Objective: the maximum tolerable period in which data might be lost from an IT service due to a major incident, i.e. how many seconds worth of data can the business tolerate losing.

    Business Continuity implies the resumption of normal operations, however inevitably it means the prospect of the business itself surviving the disaster. You can Bing or Google to dig up some current stats about how many business completely go under after they lose a non-trivial amount of their proprietary business data.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Thanks.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply