AlwaysOn DRP test - discarding the changes in the DR replica

  • We are doing a DRP test on a large database, which is hosted on an AlwaysOn availablilty group with 3 replicas (2 sync, 1 async). The plan is to failover to the asynch replica, make some tests, then discard this database and revert back to the synchronous replicas and then resynch back to the asynch replica. Because this test is a little different than an actual DR scenario, I would like to get a confirmation of the steps to failover and failback. I'd rather not have to restore the entire AG group, keeping the original 2 synchronous replicas.

    Let's call these replicas P1 (primary replica), P2 (secondary replica, synchronous), DR (secondary replica, asynchronous).

    Failover:

    Stop/disable SQL service for P2

    Stop/disable SQL service for P1

    Force failover on DR (ALTER AVAILABILITY GROUP agxxxx FORCE_FAILOVER_ALLOW_DATA_LOSS)

    -- MAKE TESTS --

    Failback:

    Remove DR from availabililty group

    Delete DR replica database

    Start/enable SQL service for P1

    Start/enable SQL service for P2

    Force failover on P1 (ALTER AVAILABILITY GROUP agxxxx FORCE_FAILOVER_ALLOW_DATA_LOSS)

    Add DR back to availabililty group, resynchronize database

    Will it work like this? I'm a bit unclear at what to expect when we failback. If the asynch replica is removed first (to prevent test data from going into prod), will the synchronous replicas resync correctly and come online when I force failover to P1?

    Thanks!

  • JarJar (10/6/2016)


    We are doing a DRP test on a large database, which is hosted on an AlwaysOn availablilty group with 3 replicas (2 sync, 1 async). The plan is to failover to the asynch replica, make some tests, then discard this database and revert back to the synchronous replicas and then resynch back to the asynch replica. Because this test is a little different than an actual DR scenario, I would like to get a confirmation of the steps to failover and failback. I'd rather not have to restore the entire AG group, keeping the original 2 synchronous replicas.

    Let's call these replicas P1 (primary replica), P2 (secondary replica, synchronous), DR (secondary replica, asynchronous).

    Failover:

    Stop/disable SQL service for P2

    Stop/disable SQL service for P1

    Force failover on DR (ALTER AVAILABILITY GROUP agxxxx FORCE_FAILOVER_ALLOW_DATA_LOSS)

    -- MAKE TESTS --

    Failback:

    Remove DR from availabililty group

    Delete DR replica database

    Start/enable SQL service for P1

    Start/enable SQL service for P2

    Force failover on P1 (ALTER AVAILABILITY GROUP agxxxx FORCE_FAILOVER_ALLOW_DATA_LOSS)

    Add DR back to availabililty group, resynchronize database

    Will it work like this? I'm a bit unclear at what to expect when we failback. If the asynch replica is removed first (to prevent test data from going into prod), will the synchronous replicas resync correctly and come online when I force failover to P1?

    Thanks!

    I don't think you will be able to bring up P1 and P2 successfully after you remove DR from AG group.

    It will probably go into the RESOLVING state since your primary was on P3.

    On P3 you can suspend data movement and failover to P1 and then delete the db from P3 and recreate it from backup\restore on P1.

    Alex S
  • JarJar (10/6/2016)


    We are doing a DRP test on a large database, which is hosted on an AlwaysOn availablilty group with 3 replicas (2 sync, 1 async). The plan is to failover to the asynch replica, make some tests, then discard this database and revert back to the synchronous replicas and then resynch back to the asynch replica. Because this test is a little different than an actual DR scenario, I would like to get a confirmation of the steps to failover and failback. I'd rather not have to restore the entire AG group, keeping the original 2 synchronous replicas.

    Let's call these replicas P1 (primary replica), P2 (secondary replica, synchronous), DR (secondary replica, asynchronous).

    Failover:

    Stop/disable SQL service for P2

    Stop/disable SQL service for P1

    Force failover on DR (ALTER AVAILABILITY GROUP agxxxx FORCE_FAILOVER_ALLOW_DATA_LOSS)

    -- MAKE TESTS --

    Failback:

    Remove DR from availabililty group

    Delete DR replica database

    Start/enable SQL service for P1

    Start/enable SQL service for P2

    Force failover on P1 (ALTER AVAILABILITY GROUP agxxxx FORCE_FAILOVER_ALLOW_DATA_LOSS)

    Add DR back to availabililty group, resynchronize database

    Will it work like this? I'm a bit unclear at what to expect when we failback. If the asynch replica is removed first (to prevent test data from going into prod), will the synchronous replicas resync correctly and come online when I force failover to P1?

    Thanks!

  • If the plan is to stop P1 and P2 then the databases on the 3rd replica will be in restoring mode. They can be brought online by using the WITH RECOVERY command.

    When you need to revert back. You can delete the databases then bring 1 and 2 back on line and resync.

  • thanks guys. fortunately, i have the opportunity to make a test with our QA system first. i'll try the plan suggested by Alex and see how it works. it seems logical to me.

  • FYI, so this is the sequence that worked for the test failover scenario (discarding any changes on DR copy)

    Failover:

    Force failover on DR

    remove P1 and P2 from availability group (only DR replica is in the group during the functional tests on DR)

    -- MAKE TESTS --

    Failback:

    Remove DR replica from availability group

    Delete DR replica database

    Recover the P1 database by restoring log

    Add the P1 database back to the availability group

    Failback to P1 (At this point the original database is available again, applications can reconnect)

    Resynchronize the database across the P2 and DR replicas

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply