
Database DevOps for Pragmatists: A Datafaction Case Study

Ami Adler, a Software Development Manager at Datafaction, explains how they used Redgate tools, and others, to introduce automation and testing into their Database DevOps processes, achieving a faster and more reliable deployment process for their application.

Guest post

This is a guest post from Ami Adler.

Ami Adler is the Software Development Manager at Datafaction, a software company offering specialized cloud-based accounting and secure bill payment capabilities. Ami is passionate about delivering high-quality software to clients quickly and reliably and has led the DevOps transformation efforts at Datafaction.


In his free time, Ami enjoys hiking and photography and is always looking for opportunities to visit his favorite landscapes in Southern Utah.

Complex database development is challenging to manage because there are so many factors to consider. Database DevOps provides automation solutions that make deployments more consistent and auditable, encourages teams to improve existing processes in continuous cycles, and allows them to adapt their approach in response to pragmatic business decisions.

At Datafaction, we adopted the policy of introducing a series of manageable, incremental improvements across the entire release chain, without disrupting, or distracting from, current workflows and business operations. These improvements encompass the entire application lifecycle and benefit all participants, from management and development, to QA and operations. We started with a manual process and focused on its capabilities and boundaries. Once these were understood by the teams, we took every opportunity to introduce appropriate automation and testing, using Redgate tools and others, to make the process easier and less error prone.

This ongoing process has already led the team to make significant improvements to our database development and deployment processes. For example, we have:

  • Eliminated discrepancies in the release – previously, inconsistencies in the state of the QA, Staging and Production databases would cause frequent deployment failures. By a process of schema comparison and reconciliation, using SQL Compare, we reduced the failure rate to virtually zero.
  • Automated database deployments – when database changes become candidates for release, we use SQL Change Automation to generate migration scripts that deploy the changes automatically through QA, Staging, and then Production, safely promoting the new development version through to release.
  • Introduced database version control – we now have in version control both the object-level source for our database and the migration scripts to deploy reliably from one database version to another. The object-level source, maintained in development using SQL Source Control, gives us better visibility into development, earlier testing and checking of database changes, and has helped us start reconciling historical differences between development and pre-production.
  • Improved database integration testing – using SQL Change Automation we introduced more frequent integration of changes to database code and application components and improved the quality of our testing. This has given us more confidence that the release candidate meets all the requirements defined by our tests.

Perhaps the most visible manifestation of these improvements is seen on our release days. The team still habitually gathers for a conference call at the start of every release. Where once these were long meetings, often a source of stress and frequent on-the-spot troubleshooting, they are now almost ‘non-events’ and feel more like a formality. There is almost no reason for a human being to be involved in our deployments, which is a mark of how far the team has progressed, even when taking small steps.

Database DevOps will be viewed and implemented differently, according to the team or organization. However, with the right tools, we have found that it has rewarded our investment with increased levels of collaboration, systematic improvement of our development processes, and more frequent, reliable, and less stressful releases.

Project Details

Describing the nature of the application, and of its development and deployment processes, at the start of the project.

The application

The application is a cloud-based (SaaS) accounting system, with a SQL Server database, for the business management industry. It was a rewrite of a legacy on-premises application and was now going through a phase of rapid evolution on the new platform, as the development teams improved and expanded the tool’s capabilities. This required significant changes both to the application code and to the underlying database structures (tables and stored procedures). Therefore, every new release of this application represented major changes that required around 10-20 database scripts.

An accounting system, understandably, requires a thorough system of checks and tests for all changes, to ensure, for example, that all financial calculations and statements return correct results under all conditions. Much of our testing was manual, initially, and the length of the test cycle was the single biggest factor in determining the frequency of releases. We could only do full application releases every 5 weeks, although hotfix releases were weekly, as required.

The development and deployment processes

Development was carried out by approximately 20 developers and testers, split across three teams, each of which focused on different areas of the application. The teams worked in two-week sprints, and all three teams made database changes on a single, shared development copy of the database used by the application.

The two main pre-production environments, for testing and verification processes, were QA and Staging, each of which also held a copy of this database. At the start of the project, none of these databases were in version control, and all database changes were tracked and deployed manually. When a database change was required to support a new application feature, or a bug fix, the development team would make the change directly to the shared development database, and the team lead would be responsible for informing those in charge of deployment of the required database changes.

Deployments were manual. The person responsible would construct a migration script to apply the requested database changes, test the script, then run it on the QA and Staging databases. The day of a release was usually long, and stressful for everyone involved, due to the high likelihood of failures. Several members of the team were required to be present on a conference call at the start of any deployments to production, to help with ‘real time troubleshooting’.

The following sections summarize the main causes of these issues.

Drift between QA-Staging-Production

In theory, no changes should be made outside of our official deployment processes, but in practice ad-hoc changes were inevitable, in response to ‘emergency’ fixes or performance issues. This meant that the Production database was not the same as the Staging database, which in turn had drifted from QA. Often the differences were relatively minor, such as in indexes, but occasionally they were structural. We could never be fully confident that ‘it worked on Staging’ meant it would work on Production too.

Since promoting changes from development to QA and QA to Staging required manual scripting, it was easy to ‘miss a change’. When an object in Staging was not in the same state as in QA, scripts could fail when promoting subsequent changes to that object.

No easy way to track development changes

When the application code for a new release was checked in, the person in charge of the deployment relied on the team lead to also send a list of the associated database changes. There was no way to understand, beforehand, which objects were being refactored or why, or which changes were already tested and ready for review.

The shared development database had been built and expanded over a number of years, with developers on each of the teams making structural changes, or adding data, in different ways to support new features and the various forms of testing they needed. It was in a very different state from the pre-production databases. Again, though, we had no easy way to know for sure which changes and differences were legitimate but ‘missed’, which were works in progress, and which were simply the result of aborted or abandoned work.

Initial Objectives and Outcomes

Our objective was a fully automated Continuous Integration (CI) and Continuous Delivery (CD) pipeline that would allow us to rapidly deliver changes in the application. Towards this goal, the development teams were working hard to expand and improve their testing coverage and to introduce automation. However, we knew that we could not achieve smoothly automated releases unless we also made substantial improvements to the way in which we tracked and deployed the required database changes.

The first objective was to gain a more reliable and consistent deployment process for the database, and therefore less stressful product releases. We also knew that progress would be slow and steady. We could not afford to “shut down” current processes, disrupting existing business workflows, to get everything exactly right, first time.

Four steps on the path to CI/CD for the database

Before we could achieve any form of rapid releasing, we identified four important steps we needed to take to improve our existing database development practices. We agreed that we should address discrepancies between the databases in the different environments, automate the deployment process, introduce database source control, and improve database integration testing.

1. Align QA-Staging-Production (SQL Compare)

Through a process of schema comparison, analysis of differences, and synchronization, we aligned the schemas of the QA, Staging and Production databases.

The first stage in removing discrepancies in the release candidate, as it moved along the deployment pipeline, was to be able to detect differences.

Immediately following a release, we used SQL Compare to get a list of schema differences between Production and Staging. For each ‘discrepancy’ that it detected, we made decisions as follows:

      • Objects that existed in Staging but not in Production – were they legitimate changes that had been ‘missed’ in previous deployments? If so, deploy them to Production. If they were legitimate but their release was delayed due to incomplete testing, leave in place. If deemed illegitimate, remove them.
      • Objects that existed in Production but not in Staging – was this “out of process” manual change (adding an index, say) legitimate? If so, deploy to Staging. Otherwise, in rare cases, remove from Production.
      • Objects that existed in both but were different – was the change legitimate but simply missed in a previous deployment? If so, use Staging as the “master” and deploy its version to Production. If the reason for the difference was unknown, do the reverse: treat Production as the master and deploy its version back to Staging.

We then compared QA and Staging and made similar decisions. If the change was illegitimate, remove it from QA. If legitimate, deploy to Staging for the next release.
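
SQL Compare did the detection work for us, but the underlying idea is easy to sketch. As a rough illustration only (not how SQL Compare works internally), if copies of both databases sit on the same instance, a first pass at finding objects present in Staging but missing from Production can be written in plain T-SQL; the database names here are hypothetical:

    -- Rough drift check: objects present in Staging but missing from Production.
    -- Assumes copies of both databases on one instance; names are hypothetical.
    SELECT s.name AS schema_name, o.name AS object_name, o.type_desc
    FROM StagingDB.sys.objects AS o
    JOIN StagingDB.sys.schemas AS s ON s.schema_id = o.schema_id
    WHERE o.is_ms_shipped = 0
      AND NOT EXISTS (SELECT 1
                      FROM ProductionDB.sys.objects AS po
                      JOIN ProductionDB.sys.schemas AS ps
                        ON ps.schema_id = po.schema_id
                      WHERE ps.name = s.name
                        AND po.name = o.name
                        AND po.type = o.type);

Swapping the two database names finds objects present in Production but not Staging. A check like this cannot see definition-level differences within matching objects, which is where a dedicated schema comparison tool earns its keep.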

After we succeeded in removing the discrepancies in the state of the database, the deployment failures that were once common became rare.

2. Introduce SQL Change Automation for Deployments

Automate deployment of database changes from QA to Staging to Production.

Redgate’s SQL Toolbelt provides both state-based (schema comparison) deployment tools such as SQL Compare, and tools like SQL Change Automation that can also deal with migration-first, or hybrid, approaches.

The state-based approach is simple and elegant, and ideal for scripting out changes to programmable objects (stored procedures, functions, etc.). However, our deployments often included multi-step operations that involved changes to the data as well as the schema, such as adding a new column that was a ‘hybrid’ of data from other columns and then deleting the old ones. The state-based approach can be adapted to accommodate such changes, while preserving existing data, but it requires additional manual scripting. Because of this, we opted for the greater level of control afforded by migration scripts and SQL Change Automation. These scripts allowed us to define precisely how to ‘migrate’ an existing database, safely, from one version to another.
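
To make the distinction concrete, here is a minimal sketch of the kind of multi-step migration script described above; the table and column names are hypothetical, and a real script would also handle constraints and error cases:

    -- Hypothetical migration: replace FirstName/LastName with a combined FullName.
    -- Step 1: add the new column as NULLable, so existing rows remain valid.
    ALTER TABLE dbo.Client ADD FullName NVARCHAR(200) NULL;
    GO

    -- Step 2: back-fill the new column from the old ones.
    UPDATE dbo.Client
    SET FullName = FirstName + N' ' + LastName
    WHERE FullName IS NULL;
    GO

    -- Step 3: drop the old columns only once the data is safely migrated.
    ALTER TABLE dbo.Client DROP COLUMN FirstName, LastName;
    GO

The GO separators matter: the UPDATE cannot be compiled in the same batch that adds the column it references. A state-based comparison would simply see that FirstName and LastName disappeared and FullName arrived; only the migration script captures how the data gets from one shape to the other.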

We created a SQL Change Automation project (called ReadyRoll at the time) and, as a preliminary step, used the Staging database to create a ‘baseline script’, essentially a build script for the current version of Staging. We compared this to the QA database and verified that they were identical.

When a developer made changes to the application that had database dependencies, their team lead would import those changes directly into the SCA deployment project and they would be ready for automated deployment to QA.

After review, we started to test and verify the deployment process. With all databases aligned, we found we could consistently and successfully deploy the changes from QA through to Staging and Production, in a controlled manner.

This was a big step forward for the team.

Over subsequent release cycles, we flexed our new deployment process, tested its capabilities and limitations, and adapted it as required. For example, a release that involved changes to tables required a full cycle of application testing and deployment. However, we also needed our process to be flexible enough to provide a fast-track system for deploying small changes that were designed to fix a specific problem. What if, for example, a developer discovers an update to a stored procedure, or an index change, that will fix an urgent problem, or significantly speed up a critical business calculation? Could we apply this sort of database change right now, independently? In our trial runs, this caused us difficulties, because the same index or routine would be deployed again in the next application deployment cycle, and the deployment would fail as a result. However, once we understood and documented the process to do this safely, so that the object was only created if it did not already exist (that is, the script was idempotent), it worked perfectly.
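
A minimal sketch of such an idempotent hotfix script follows; the object names are hypothetical, and CREATE OR ALTER requires SQL Server 2016 SP1 or later:

    -- Idempotent hotfix: safe to run now, and again in the next full deployment.

    -- Create the index only if it does not already exist.
    IF NOT EXISTS (SELECT 1 FROM sys.indexes
                   WHERE name = N'IX_Invoice_ClientId'
                     AND object_id = OBJECT_ID(N'dbo.Invoice'))
        CREATE NONCLUSTERED INDEX IX_Invoice_ClientId ON dbo.Invoice (ClientId);
    GO

    -- CREATE OR ALTER makes the routine change rerunnable without a guard clause.
    CREATE OR ALTER PROCEDURE dbo.RecalculateClientBalance @ClientId INT
    AS
    BEGIN
        SET NOCOUNT ON;
        -- ... the updated, faster calculation logic would go here ...
    END;
    GO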

While we now had a much more reliable process for promoting changes from pre-production to production, we still had little visibility into the changes that were coming from one development cycle to the next.

3. Get the shared development database into version control (SQL Source Control)

By putting the shared development database into version control, the deployment team could then track forthcoming database changes much more effectively.

Our development teams adopted the use of SQL Source Control, using it to commit changes made to the shared development database into version control. This meant we now had in version control the object-level source for every database object and could see immediately which tables and routines were being amended in the current development cycle, and who made which changes and when. This had the immediate benefit of giving those in charge of deployments much greater visibility into forthcoming changes. We could review the forthcoming changes, understand their purpose, test them, and then script them into the SQL Change Automation project.
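
For context, the object-level source that SQL Source Control commits is essentially one CREATE script per object, so a table’s history in version control reads as a series of diffs against a file something like this (a hypothetical example):

    -- Tables\dbo.Invoice.sql: one object, one script, one version history
    CREATE TABLE dbo.Invoice
    (
        InvoiceId INT IDENTITY(1, 1) NOT NULL,
        ClientId  INT NOT NULL,
        IssuedOn  DATE NOT NULL,
        Total     DECIMAL(18, 2) NOT NULL,
        CONSTRAINT PK_Invoice PRIMARY KEY CLUSTERED (InvoiceId)
    );

When a developer adds a column to such a table, the commit shows exactly which lines changed, who committed them, and when, which is precisely the visibility the deployment team had been missing.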

We also completely cut out the problem of required changes being missed, which still happened occasionally under our older system of having the team leads send a list. Now, we could see immediately the changes made in the current development sprint, and knew who to contact, if required.

Of course, while the shared development database was now in version control, we still had the issue that our deployment process captured only the recent, required changes. There were still many historical changes in the development database that needed to be reconciled.

However, by introducing version control we also started to address the large, historical disparity between the state of the development and pre-production databases. We could differentiate clearly between current and “older” changes (those pre-dating the introduction of SQL Source Control). We also had more confidence to start removing those changes we felt were likely ‘abandoned work’. Whereas before, reversing such a decision would have required restoring a fresh copy of the database and retrieving the deleted objects, now we knew we could get any object back, or at least the scripts for it, by reverting the check-in that removed it.

Once this process is complete, we’ll have in version control a complete history of changes to all individual schemas and objects, along with who made them and why.

4. Better integration testing (SQL Change Automation)

Expanding our use of SQL Change Automation to incorporate the database into daily integration tests.

As we were building out our database tracking and deployment automation, we were also working on improving our automated testing process for the application and on building all of this into our CI/CD pipeline.

One of the challenges of running integration tests in a consistent and reliable manner is to ensure that we are always executing the tests against a known, good dataset. To improve the quality of database integration tests, we built a new database that contained a ‘pristine’ snapshot of test data, created a backup, and checked it into source control. Then, we ran the tests as follows:

      1. When application code changes were checked in, restore the backup to a test server (see the T-SQL sketch after this list).
      2. Run the migration scripts in the latest SQL Change Automation project to migrate the restored database to the current development version.
      3. Deploy the application code changes and run the integration tests.
      4. Tear down the database.
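
Steps 1 and 4 are ordinary SQL Server operations; a minimal sketch, with hypothetical database and file names, looks like this:

    -- Step 1: restore the 'pristine' test-data backup over any previous copy.
    RESTORE DATABASE IntegrationTestDB
    FROM DISK = N'C:\Backups\IntegrationTestDB_pristine.bak'
    WITH REPLACE, RECOVERY;

    -- (Steps 2 and 3: run the SQL Change Automation migrations, then the tests.)

    -- Step 4: tear down, forcing off any lingering test connections first.
    ALTER DATABASE IntegrationTestDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE IntegrationTestDB;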

Database DevOps recommends integrating and testing database changes as frequently as possible, ideally continuously. While we have not quite achieved “Database CI”, this is another important, stepwise improvement to the quality of our testing, and therefore to the reliability of our software releases. By integrating changes to database code and application components frequently, and testing the results, problems are also becoming visible much earlier, when they are easier to manage.

Next Steps

Database DevOps at Datafaction is a continuous, steady cycle of feedback and improvement. Where next?

Test data provisioning

Although we have made use of a source-controlled database snapshot for consistent test data, we still need to provision larger data sets that are more consistent with the production system. An accounting system requires a lot of data, and very often the distribution and volume of this data will have a significant impact on both test results and performance.

This means that we need to build test data sets to cover the tests required for every part of the accounting system. We need to make this data available ‘on demand’, so that developers can, for example, spin up a fresh copy of the latest development build, fill it with the required data, run their tests and tear it down again. We are evaluating a number of options for this, one of which is the use of SQL Clone and Data Masker to get obfuscated-but-real data into Staging, and then make clones available on demand, for development use.

At this point, we’d be able to install local ‘sandbox’ copies of the database for developers, alongside the shared system. This would both give them greater freedom to experiment with database changes, without fear of disrupting others, and improve the overall stability of the shared system.

Seamlessly integrate changes from development to QA

We continue to make step-by-step improvements to our existing database development and deployment processes. Currently, our development changes are tracked via the object-level scripts, created and managed by SQL Source Control, and then those changes are deployed in a separate process, using migration scripts created and managed by SQL Change Automation.

We plan to further investigate ways that we can more tightly integrate and automate the process of tracking changes and deployment scripts between SQL Change Automation and SQL Source Control.

Summary

Any DevOps process for application delivery depends on a reliable way to test and deliver the accompanying database changes. As a team, we are clear in what we want to achieve: a fully automated pipeline for continuous delivery of application and database changes. However, our approach to achieving it is, inevitably, pragmatic. We progress in small steps that do not overturn current workflows, or risk disrupting the everyday operation of the business. This also gives us the time we need to understand the requirements and limitations of each process, and how to use our Redgate tools, and others, to reduce their complexity and improve our control over them. Above all, the approach is collaborative, and its benefits are felt by all involved in it.

Datafaction is an RBC company and is an affiliate of City National Bank Member FDIC.

Redgate is a registered trademark of Redgate Software Ltd registered in the U.S. Patent and Trademark Office.

Redgate is an independent company and is not affiliated with Datafaction. Datafaction is not responsible for any information contained herein relating to Redgate or the products or services it offers.
