• opc.three (3/23/2012)


    What I often see are separate databases on the same instance, where the staging database is in simple recovery mode and the data warehouse is in full recovery mode. Procs parked in the staging database use cross-database calls to load the data into the data warehouse database after it has been massaged and prepared in tables in the staging database. Different schemas in the same database do not give you the separation necessary to run different recovery modes.

    The big barrier to moving the staging database onto a different instance altogether is the additional overhead of getting that data from the staging database into the data warehouse. Consider a simple cross-database INSERT...SELECT versus a secondary push job using SSIS or some other means. It's a tradeoff when having your staging database on a separate instance between flexibility in terms of scaling your environment and ease of loading your data warehouse.

    I understand the Staging database using the simple recovery model, but curious why the data warehouse is using the full recovery model? Is it due to size and amount of data loaded each time that t-log backups are considered better?

    Previous company I worked at that was building a dat warehouse had theirs using the simple recovery model. A full backup was run after the nightly load process was completed.