Log shipping vs Replication

  • Dear Community,

    I am tasked with coming up with a High Availabilty option for our Data Warehouse server - SQL Server 2005 Standard Ed.

    The two servers will be in separate parts of the USA, so Clustering is out. Our applications don't support the failover option for Database Mirroring.

    What we want:

    - Primary server that is our Hot, production site

    - Secondary server that is warm

    - Log shipping/replication will happen after morning ETL completes

    - Secondary server will be online for ad-hoc querying

    - Failover primary to secondary will be manual

    - Execute full backups from the Secondary Server

    Any ideas on Log Shipping vs Replication?

    I don't know if executing full backups from the Secondary server is even possible with Log shipping/replication. I've found lots of articles on setting up log shipping/replication, but little out on maintaining. I.E. - I haven't seen anyone explain doing a full backup after log shipping has started.

    The idea of executing full backups from the Secondary Server comes from the limited resources at our off-site facility, where the Primary would be.

    Thanks!

    Rob

  • Based on your requirements it sounds like log shipping is more applicable, however keep in mind you'll need to change the recovery model on your database to support log shipping. If you're currently using the simple recovery model and switch to full or bulk-logged you may see increased disk space usage as a result. Taking regular log backups should keep that in check though.

    Re: backups, if you go with replication you can do backups of the standby database without jumping through any hoops. With log shipping the best way I've found to do backups of the standby DB is to take it offline and copy the MDF, NDF, and LDF files, then bring the DB back online. The downside is that you can't use a tool like SQL Backup or SQLSafe to do compressed backups.

    Also keep in mind that when using standard transactional replication your subscribers won't have any identity columns that are on the publisher. If you have to fail over to the subscriber you'll need to apply the identity columns to every table that had it on the publisher and that can get tricky.

    Kendal Van Dyke
    http://kendalvandyke.blogspot.com/[/url]

  • Hi Kendal,

    Thanks for your reply!

    Most of our Data Warehouse resides in 2 databases - currently one 300GB and one 400GB. I don't think I'd have the disk space to get away with copying the mdf, ndf, and ldfs. We do use SQLsafe and love the compression. Thanks for the warning.

    So how do people handle database backups in a Log Shipping setup? Do people ever do a "full" backup of the Primary? If there was some kind of data corruption on the Primary that had already found its way to the Secondary, how do people recover from that?

    I've been reluctant to go with Transactional Replication because 95% of the updates to the databases happen in the morning between 2AM - 6AM. I fear that would increase the execution time of my ETL processes and impact the early morning users. Any thoughts on that?

    Merge replication sounds appealing - I like the idea of being able to ship everything over when ETL is done. Although updates will really only happen on 1 of the servers. I've read that sometimes people had issues with DDL updates using Merge though.

    Thanks again!

    Rob

  • >>I've been reluctant to go with Transactional Replication because 95% of the updates to the databases happen in the morning between 2AM - 6AM. I fear that would increase the execution time of my ETL processes and impact the early morning users. Any thoughts on that?

    In transacational replication, the log reader reads the publisher transaction log and does not interfer (whatsoever) with the insertions/updations/deletions happenning on the publisher database (i.e. does not cause any blocking). Given that, I am not able to understand how Tran repl will affect your ETL execution time.

    The only other thing with transaction replication is that if your log reader is in a stopped mode and you start making huge changes, that can cause your publisher transaction log (LDF) files to grow (believe me, sometimes it can really grow huge!). So, as long as you keep log reader running and ensure you have enough free space for the Transaction log file to grow, you should be fine with this option as well.

    ...my two cents

  • Rob,

    My 2 cents...

    With replication, you can specify interval/schedule at which agents run. so, It is possible to sync your subscriber at specific time/interval if you are concerned with server performance impact during heavy ETL activity. You can backup your subscriber database just like primary, however depending on when the agent was run last, your subscriber will be behind primary. As far as identity, we have used identity columns with automatic management option in transactional replication, so it should not be a problem. The limitation with replication is maintenance complexity (e.g. adding tables, changing schema requires more efforts)

    The easier option will be mirroring or log shipping. I am not sure why you can't use mirroring. You have to change connection string with log shipping also after failover and mirroring has manual failover option that might work for you.

    Hope this helps

    np

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply