Continuous Database Deployment with Confidence

To achieve gains from Database Lifecycle Management in general, and in Continuous Deployment in particular, it pays to aim for lightweight but consistent techniques and processes, refining them iteratively in the light of experience. Edward Elliott describes a Test-driven development process for developing SQL Server databases that worked for CD, based on a series of steps. Though the tools for CD may be necessary, they are not sufficient, without an effective structured process.

Overview

Deploying many changes a day to a SQL Server database sounds hard; but as long as you have the tools and processes in place, and you understand the steps involved, then it is straightforward. In this article, I would like to show one example of the way that a simple and informal development workflow can allow us to measure the deployment time for database changes in terms of hours rather than in days, weeks, months or in not unheard of circumstances, years.

What follows is a development process that I have actually used with a team of developers, some of whom were completely new to Continuous Deployment (CD) for databases. It is not the intention of this article to give you the scripts to implement continuous deployment: Instead I want to show that it is the actual development process that you adopt that aids continuous deployment rather than the use of one or more specific tool. Tools may be necessary but they aren’t sufficient: you have to have a robust process in place.

Assumptions

This process that I’m illustrating works for a development team who have an in-house database that they develop themselves. It doesn’t really work for ISVs who do not manage the customer databases because the development team cannot then control how, or when, an external agency applies changes to the database.

The iterative stages of a deployment

Step 1 – “Please make a change” – Time 15 minutes, elapsed time 15 minutes.

The first step is to be given a change to make, this could be a feature request or a bug to fix. At the point when you accept some work, you should try and break it down into a series of changes, each of which requiring as small a deployable piece of code as possible, and look at using ‘feature switching’. Feature switching is a technique that allows you to switch code in and out of a production application for all or some of your users. By making a series of several smaller simpler incremental changes rather than a single large change, it is possible to reduce the overall time to get to the end goal rather than trying to “boil the ocean” by getting everything done in one big release. If the change requires many releases but the overall functionality should not be available immediately until subsequent releases, then you can deploy the code but leave it disabled via feature switching until it is time to enable it.

You can usually make changes much more safely by deploying little and often. This is because it is considerably easier to test and manage a small change, along with its ramifications. With databases, the ramifications of change seem to increase logarithmically with complexity, and so it makes sense to avoid a release that involves many files and hundreds of changes lines of code.

In order to identify a change to make, and assign it to a developer, you might use a variety of tools: It could be anything from a work-item-tracking solution such as Jira or VSTS all the way to whiteboard and post-it notes. It doesn’t really matter what you choose as long as the developers know what they should be working on, and what to do if they get stuck.

Step 2 – Investigate and discuss – Time 1 hour, elapsed time 1 hour 15 minutes

The second step is to investigate the work and see what needs to change, and what possible side effects there might be. Once you have identified the change that you are likely to need to make, you should then discuss the change with other developers and testers. These discussions should help to identify what consequences, both good and bad, that the change will have. They will also help you to check whether the way that you are planning to approach the change is correct. I often find that at this point, it becomes clear that the change can cause serious other problems or serious side effects that make it much more challenging than originally anyone considered.

This can lead to feedback discussions with the product owner about whether to go ahead with the change or to drastically alter the outcome of the change. If a change is stopped at this point then it is great for the business: It is far more expensive to the business if they have to wait until something has been deployed to find out that a small change has broken some user’s workflow.

If the work goes ahead, then the development team need to decide what tests are required. If they decide that the code is already well covered by unit and integration tests then that might be enough, but if it is an area where there are not many tests, perhaps this should be used as an opportunity to increase code coverage in that particular area.

Step 3 – Create branch in git – Time 10 minutes, elapsed time 1 hour, 25 minutes

The developer needs to create an environment to work in. Git is becoming more and more popular and if you are an active development team then I suggest that you should have either started using git, be considering it, or have very specific requirements which preclude it.

The developer creates a branch from the main deployment branch and gets the latest code locally. Once you have the code locally, you should build and deploy it to your local developer’s ‘private’ instance.

To deploy to your local private instance, there should be scripts that take your individual scripts from source control and automatically deploy them to create your version of the database. You should also have scripts to seed your test data so you can drop your database, deploy the code and then deploy any test data you might want.

It takes a little time to get the scripts together so that you can deploy locally with tests data. It is time well-spent because every single time that you start working on a change, you are benefitting from that upfront work. Getting your local database up to date with the version in production in terms of the metadata is critically important to making sure that changes are made with the knowledge of how production looks – anything else is reckless and dangerous.

Step 4- Write unit tests – Time 2 hours, elapsed time 3 hours, 25 minutes

In this step, we must write the unit tests. The time it takes to write the tests is completely dependent on what the change is, how many parts of the system it affects and how well covered by tests an area already is.

For example, if you are making a simple change in a procedure that is already well covered by tests, you can probably just change an existing test or write a new one to adequately test the change. If you find that there are not any other tests for the part of the code where you will be making a change, then it will take longer to write unit tests.

It is important to remember that the time it takes to initially write tests is a one-off that will pay dividends later. It will take a great deal of effort to write tests for the first small change, but every other change after that will then benefit from the work. If the code is going to be changed in the future, then this upfront effort is never wasted.

One wonderful side-effect of devising the tests is that the developer should have a good idea about what the change will involve. Much of the work writing unit tests is about understanding the change and what effects it will have and making sure the tests adequately cover the changes.

Another positive side-effect of this exercise is that can clarify the repercussions of the change to the extent that the developer needs to go back to the product owner or other team members for clarification and it is not uncommon that changes are stopped or suspended at this point.

Step 5 – Write integration tests – Time 30 minutes, elapsed time 3 hours, 55 minutes

The task of writing integration tests should be a matter of setting up some data, running the application code and then verifying the data at the end. It can involve quite a lot of code to get to the point at which this task of writing integration tests is simple, but it should be sharable between tests so again, you gain long-term. Once you have made the initial up-front investment in the testing framework you can quickly extend the range of tests, thereby making the whole application easier to change and to have confidence in.

Step 6- Write code, Time 10 minutes, elapsed time , 4 hours

The time it takes to write the code will obviously depend on the nature of the change. A large refactoring task that changes hundreds of objects will take a long time, whereas most changes should be small and simple: Because you have spent the time writing tests and understanding the change, the actual code should be quick to write.

In many ways, if you have a thorough development process, and a stringent developer testing strategy, then the job of writing the actual code is the least important part of the process. The test code, and the interaction with other developers and the product owner, are the critical parts of the process. If the code doesn’t get written correctly, then it won’t be deployed to production.

Step 7- Merge latest production code, Time 5 minutes, elapsed time 4 hours, 5 minutes

Since you last pulled the code from production version of the database project over four hours ago, there may have been quite a lot of changes by other team members done in parallel, so you need to re-download the main branch and merge it into your changes. If you have kept your change small or you have successfully partitioned the database into logically-separate schemas, then the merge should be simple: if you have changed the same files as five other people, or all worked on the same schema, then the merge will, most likely, be harder.

Step 8 – Compile code, deploy locally, run tests – Time 15 minutes, elapsed time 4 hours 20 minutes

It is at this stage that you can prove to yourself that the code works with the latest production changes and that you are ready to push the code. When running the tests, you will probably have just been running the tests that cover processes affected by this change. This is an opportunity to make sure that you haven’t inadvertently broken something you didn’t expect. If you have a low coverage of the system, then this can take much longer. You may have some manual steps to perform to test the system, and so it can take much longer.

To gain confidence that your change will not break anything you need to have high test coverage: The tests need to complete quickly and should only fail when there is an error. A test suite that fails randomly should not ever be trusted.

Depending on the type of application and how involved the product owner is, you will also be likely to demo the changes to the product owner. This is more likely with a feature that changes functionality rather than a bugfix, but it can be done at this point.

Step 9 – Push changes to remote branch, 5 minutes, elapsed time 4 hours and 25 minutes

In this step, you take your local changes and push then to the source repository. If there have been changes since you last pulled, then you will need to go back to step 6 and repeat. The longer that each step takes and the larger the changes made by everyone, the harder it is to complete your work; so it is in everyone’s best interest to make lots of small changes rather than fewer larger “merge hell” ones.

Step 10- Build server builds, tests and reports status of change – Time 20 minutes, elapsed time 4 hours and 45 minutes

The build server should monitor for changes and then check out the code, compile, deploy to a test database and run every test that it can. It must be easy to see the results of the build and to investigate any build failures. While the build runs, it is often a good time to update any documentation or go and grab that coffee that the developer has been postponing all morning.

Step 11 – Create a pull request, Time 5 minutes, elapsed time 4 hours and 50 minutes

This involves creating a pull request to the main development branch and finding someone to review the code. Part of a good development process is to have a good definition of the word ‘done’. It is a good chance to verify that your change looks as you want it to look, and that work is ‘done’ by the strict definition of that word that you’ve all adopted.

The definition of done is likely to include items such as code standards, test coverage and documentation. Without a definition of ‘done’ that everyone in the team subscribes to, you will find that changes are pushed through incomplete, thereby risking the entire system.

Step 12 – Pull request is code-reviewed, Time 20 minutes – elapsed time 5 hours, 10 minutes

If the change has been small, then the code review should be simple – the reviewer can look at the ticket, the change and discuss with the developer the reason they took the approach that they did. It is simple to review small code changes. There is a typical danger that, when a code review involves many files and changed lines of code, it is waved through without sufficient attention because it is so much harder to do. By keeping your changes small, not only will the work be easier but the code review will prove to be more likely to find bugs and issues before hitting production.

The reviewer should verify that the code is complete using the definition of ‘done’: If not, it should be rejected.

Step 13 – Pull request is merged to master and deployed, Time 15 minutes – elapsed time 5 hours and 25 minutes

This step is part of the development process but out of the developer hands. He or she will likely keep an eye on the deployments progress in case there are any issues, but generally the developer should look to see if there is time in the day to start working on another ticket.

Summary

It can seem as if a DLM Continuous Deployment process like the one I’ve described would be out of reach for many development teams, but if you make small in-roads in every piece of work you do, you will soon find that you are able to deploy little and often (very often) on a regular basis. This has the benefits of getting work into production quicker but it also has extra benefits such as more reliable code due to the code being reviewed thoroughly and tests covering the area of functionality affected.

Can things go wrong? Yes of course, but you are likely to greatly improve your chances of getting it right. Things are just more likely to go wrong when you do not write tests or if you manually gather your deployment scripts. At least, by using this approach you can fix any issues using a defined process.