Code deployment minimizing "down time"

  • We are currently trying to reduce "down time" during code deployments. The approach we are looking into is:

    Server A is our live production database server.

    Server B is our warm DR box kept up to date using "Log Shipping" (10 minute).

    Log shipping is stopped and code deployment starts at 10:00 going to Server B.

    When satisfied that all is well, we switch to Server B from A say at 10:30.

    Transactions that occurred between 10:00 and 10:30 are now missing from Server B.

    Code deploy to Server A at 10:45, which finishes at 11:00.

    Switch back to Server A at 11:00.

    Transactions that occurred between 10:30 and 11:00 on Server B don't exist on Server A.

    Can you see where this is going?

    We are looking at data comparison tools like Redgate's Compare Bundle but are there any standard approaches to achieving this without the use of 3rd party tools that I don't know about?

    Does this make sense?



  • Personally I use Box C as a QA box, backup/restore box A to box C, deploy code, and test. If everything is ok, then deploy code to box A and box B using automation.

    My process is described at:

    Steve Jones

  • I agree that deployments need to be tested in a PRE-LIVE environment which we do here. This environment is identical to our LIVE environment. This is not the issue though, the issue I am faced with is having a deployment with no "down time". If there is an element of "down time" this needs to be kept to a minimum.

    This is why we will switch to a secondary server and back again after deployment. The issue is getting the data back in line: each server used in the switch process will be out of service at some point during the deployment, so each will be missing some data, although the merged data from both servers will be complete.

    Let me know if that doesn't make sense.


  • Are you making db structure changes or just changes to stored procs? If it is the latter, I would test on B and then just push the new stored procs to A. Otherwise, put A in read-only mode, or sync up A and B, shunt users to B, update A, switch back to A, and push all updated records from B back to A. If you can describe your problem in more detail, maybe we can come up with a workable solution for you.


  • Right, let's take an example:

    1. We have an orders system. Orders are keyed in by 500 concurrent order entry clerks.

    2. There are many changes, and these changes need to take the database off-line while they are applied.

    3. For every minute the site is down, the business loses up to 50 orders.

    4. Management have calculated that scheduled "down time" is costing thousands a week and have put in a requirement that "down time" must be kept to no more than 30 minutes a week. With a system that is still being developed, we all know how unlikely that may seem.

    So, how do we get these objects deployed with no "down time", or at least minimal "down time"?

    As a proposal, we use the failover server as a deployment mechanism by doing the following:

    Server A is our live production server.

    Server B is our failover server.

    Log shipping keeps the failover server in line (10 minute intervals).

    At 10:00, after the most recent transaction log is shipped, log shipping is stopped and the changes mentioned in 2 are deployed to the failover server. This deployment takes 15 minutes. Assuming everything is okay, DSNs are changed to direct users to the failover server (Server B), and this occurs at 10:15. Note that for the 15 minutes since 10:00, orders have been entered on Server A. Orders are now keyed in on the upgraded Server B while deployment of code takes place on Server A. This deployment finishes at 10:30 and DSNs are switched back to Server A. Now we have a situation where 15 minutes' worth of orders have been keyed in on Server A (10:00 to 10:15) and 15 minutes' worth on Server B (10:15 to 10:30).

    My question is: how can these 30 minutes of orders, entered on 2 different servers, be merged? In terms of availability the end result is the desired one, as the only "down time" experienced by entry clerks was when DSNs were switched. It also exposes a risk, though: while the two servers are out of step, log shipping can't be restarted until the data has been brought back in line. In effect, the 15 minutes keyed in on Server B need to be brought back to Server A. Once this is done, log shipping can start again, and our failover server is back up to date and can be used for disaster recovery if needed.
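
    As an illustration only, the merge step could be sketched in Python roughly as below. This is not a real schema or a tested procedure: the `Order` record, the `merge_b_into_a` function and the `last_shipped_id` parameter are all made-up names, and the sketch assumes that orders use an auto-incrementing integer key, that the highest key shipped to Server B before 10:00 is known, and that only new rows (no updates to existing rows) were written on Server B. Under those assumptions both servers hand out the same key values after 10:00, so rows coming back from Server B have to be given new keys on Server A.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Order:
    order_id: int    # hypothetical surrogate key, assumed auto-incrementing
    customer: str
    entered_at: str  # "HH:MM", kept only to make the example readable


def merge_b_into_a(a_orders, b_orders, last_shipped_id):
    """Append Server B's post-cutoff orders to Server A's, re-keying them.

    Returns (merged_orders, id_map), where id_map maps each old Server B
    key to the new key it was given on Server A.
    """
    merged = list(a_orders)
    next_id = max((o.order_id for o in a_orders), default=last_shipped_id) + 1
    id_map = {}
    for order in sorted(b_orders, key=lambda o: o.order_id):
        if order.order_id <= last_shipped_id:
            continue  # already on Server A: it was shipped before 10:00
        id_map[order.order_id] = next_id
        merged.append(replace(order, order_id=next_id))
        next_id += 1
    return merged, id_map


# Orders 1-100 exist on both servers (shipped before log shipping stopped).
shared = [Order(i, f"cust{i}", "09:59") for i in range(1, 101)]
# 10:00-10:15 on Server A and 10:15-10:30 on Server B reuse the same keys.
server_a = shared + [Order(101, "alice", "10:05"), Order(102, "bob", "10:10")]
server_b = shared + [Order(101, "carol", "10:20"), Order(102, "dave", "10:25")]

merged, id_map = merge_b_into_a(server_a, server_b, last_shipped_id=100)
print(id_map)
```

    The returned `id_map` matters because any child rows entered on Server B (order lines, for example) would need their foreign keys translated through it before being inserted on Server A. Rows that were updated rather than inserted on Server B are not handled here and would need a separate comparison.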

    I have come up with a few ways to achieve this, but all of them need to be controlled manually, and we need to look at ways of automating them.

    Any ideas will be greatly appreciated.



  • I will try to come up with some solutions for you this week.


  • Thanks and any suggestions will be greatly appreciated.
