May 15, 2026 at 4:05 pm
The mechanics for this are relatively easy. The hard bit was the thinking required to make them that easy.
We have a small CI/CD database that holds some data that we expect to come from various applications, and also data that we expect our error handling to cope with. As it is for CI/CD purposes, we don't want this database to be ever-growing, but we do want it to cover the range of data and problems that we would expect to see in production.
Any developer can trigger a workflow that destroys their personal DB and recreates it from the current state of the CI/CD database. This means data engineers are always developing against a known starting point. We also trigger this workflow at the end of each sprint.
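As a rough illustration only (the post doesn't name our platform, so this assumes a Snowflake-style warehouse that supports cloning; the database names are made up, and on other platforms you'd use backup/restore instead), the destroy-and-recreate step boils down to something like:

```sql
-- Hypothetical sketch: assumes a Snowflake-style warehouse with cheap cloning.
-- DEV_JSMITH and CICD_SEED are made-up names for a personal DB and the CI/CD database.

-- Destroy the developer's personal database...
DROP DATABASE IF EXISTS DEV_JSMITH;

-- ...and recreate it from the current state of the CI/CD database,
-- so every developer starts from the same known baseline.
CREATE DATABASE DEV_JSMITH CLONE CICD_SEED;
```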
We can deploy from a Git branch to a shared development database, which is effectively an integration database. This is the next step once we're happy that everything is working correctly in our personal development databases.
We use infrastructure as code to deploy databases and database artefacts, and this creates the DB artefacts in both the integration database and the CI/CD database.
We also have health checks against the CI/CD database that test for zero-record tables and fail the build if any are found. We use DBT, and a DBT test passes only when it returns an empty result set. Before we implemented the zero-record check, we were getting failures in higher environments from builds that had passed the CI/CD run.
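As a rough sketch of the kind of check we mean (the file and model names here are made up), a DBT singular test fails the build when it returns any rows, so a zero-record check just has to return a row when a table is empty:

```sql
-- tests/assert_customers_not_empty.sql  (hypothetical file and model names)
-- A DBT singular test fails if this query returns any rows.
-- We return one row exactly when the table is empty, so an empty
-- table in the CI/CD database fails the build.
select 'no rows found' as failure_reason
from (select count(*) as row_count from {{ ref('customers') }}) as counts
where counts.row_count = 0
```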
The important bit is that you won't get your processes right first time; they have to evolve. We found that our first attempt was an improvement on what we had before, and each evolution improved the process further, sometimes by a little, occasionally by a lot.
We thought a lot about our pain points and concluded that if we want NFRs such as testability, we have to design for them. There are costs, but the rewards outweigh them.