Malleable Source Control

,

Git has proved to be a better fit to the needs and workflow of a database development team than anything that came before. Git is valuable because it encourages branching and merging, giving more choice in the way that your team can work. Due to the ease with which you can adapt Git, there is no obvious best approach to using Git with SQL Server databases. So much depends on the way that the team works best, the nature of the database, and of the applications that use it.

The biggest problem with databases, I've found, is working out what is the 'source'. With procedural code there is nothing to debate, it is simple. You can't 'migrate' an application. You build it from source. With databases, it all gets tricky. For a start we don't major on doing builds. For a decade or so, none of us have done an offline release where we built a production server from source on every release and then imported the production data. Instead, we migrate it from the existing version to the new release whilst preserving its data. We certainly have some code in databases that looks very much like conventional procedural source. Modules or routines, by which I mean code-based objects such as procedures, views, functions, triggers and so on, fall easily into that category. Under the covers, such code gets replaced even if the object containing the code is only modified. Not so with tables, constraints, relationships and indexes. The source of a table isn't maintained as metadata, though it can be represented or generated in script-form in a rich variety of ways that all generate the same database object. In other words, a table can be represented by several different sorts of scripts that all look different; you've lost the one-to-one correspondence between the script and the created object. This makes source control, which works out what changed by comparing scripts, trickier because you can get 'false positives' just by using a different way of generating a script. To add to the head-scratching, how do we deal with data? Obviously, only the enumerations can go into source control but where do you draw the line?

Fine, you might say, a migrations approach is surely better because it reflects the natural way of developing, where you are generally altering database objects. Your comments aren't trashed, and your code builds the database, migration by migration. True, but you tend lose the narrative at the object level, when you need answers to the questions 'who changed what in this table, when and why?'. You also need to rebuild from scratch occasionally, to make sure that there are no uncontrolled changes. If your databases always migrate, all sorts of anomalies such as disabled constraints can creep into the picture and can get into the canonical source if you are careless in the way you consolidate excessive migrations.

I suspect that there will always be diverse techniques for source control in the database, because of the enormous differences in their size, nature and requirements. We will, I suspect, make best progress in refining database development techniques with tools that encourage diverse ways of working, such as Git.

 

Rate

Share

Share

Rate