Database Deployment: The Bits – Versioning

Although databases have no inherent way of recording their version numbers, SQL Server provides the means of doing so, and much more besides. This is a great advantage to anyone faced with the task of deploying databases without errors.

Sometimes, when discussing deployment of databases you may talk about ‘versioning’  a database. Doing this versioning isn’t entirely straightforward, so it helps avoid having to go into the detail if you airily wave the hand at this point as if to signal that it is a trivial matter.  

There are a number of good reasons for having a version attached to a database.

  • It can prevent problems due to a version-mismatch. Where you have a simple one-to-one correspondence between an application and a database, then it helps the resilience of the application to be able to detect what version of the database it is working with. Where many applications are running against a database, it is better practice for each defined application-interface to report its own version number independently of the database because these don’t necessarily keep in step with the database.
  • It helps to monitor what’s out there. If you synchronise your development work with source control, it makes things easier to manage if the version number of the live database is kept  in sync with the version number  of the version in source control that was used to create it. You then know what is installed without having to compare it, or look it up in a separate record.
  • It helps with database deployment. When you are deploying a new version of a database to a number of machines, and you need to preserve the data that is on them, you can do it in several ways. The quickest way is to have your upgrade scripts and any data-migration scripts already prepared and tested for every permutation of upgrade between versions that you are likely to meet. To do this, you need to  know what versions are out there.  If you don’t have a record of the current version, then you have to sniff out each database with a SQL Comparison tool, and generate the upgrade scripts and data migration scripts.

There is no generally-accepted place to store a version number in a database schema. Databases don’t have version numbers by convention. There is no standard way of putting in this or any other information about an application into the database. I’ve seen many different methods for attaching version information to a database. Some shops adopt the practice of having a database of databases, with all the application interface in it. Others use special tables, often Entity-Attribute-Value (EAV) lashups. I dislike all methods except for the use of Extended Properties.

A Table has one clear advantage in that it is easy to access the information from SQL. However, this information does not get into an automatically-generated build script.  The data within a table like this isn’t  part of the database schema. The contents of a version table would need to be scripted separately.  With an extended property, on the other hand, the version information gets placed with the object when objects are automatically scripted out.  The information stays with the metadata. It seems better to attach the information directly to the object, and the mechanism to do that already exists whether one likes it or not.

By attaching a version directly to a database object, it solves other problems.  it is likely, for example, that each application interface will want a version number as well. Because it makes much more sense in implementing security if each application-interface is in its own schema, this boils down to being able to attach a version number to a schema. With so much going for the use of extended properties, it is baffling to see that Microsoft still haven’t implemented Extended Properties in SQL Azure. If you are stuck with having to deploy to SQL Azure, then you will have to use a table to store the information and be stuck with the key disadvantage that the build and version information won’t appear in your build scripts.

How to maintain a simple version number

What have we got to store? A version number is usually a string representation of a series of  three or four integers delimited by dots. These run from  left to right in diminishing importance, or maybe just a major and minor number with a code for the type of release (alpha, bets, Release Candidate (RC) and so on.

So as we don’t get to tied into fiddling with the complexities of Extended properties, we’ll  use William Brewer’s  excellent  sp_DbDoc.  (I use a version that is installed in Master as a system stored procedure so it can be used for database builds. It is available here)

If you want to attach a version ‘3.2.1’ to a database  this is achieved by …

…or you can change it by …

Or even append each version number…

… and delete it

You can’ obviously, use the raw  procedures and functions but I wouldn’t go that route because you’ll find it so tedious that you’d stop using extended properties, which would be sad.

Adding a property to a database is easy using the stored procedures Microsoft provides, of course, because it is the default object for these system stored procedures

But if it exists you get an error

… so you have to use…

…. or delete it first

As sp_DbDoc lets you see the extended properties of whatever you want to see, as well, I’d stick with it. Otherwise, you can always use  …

Maintaining a version number for a schema

You might want to add versions to schemas. This may seem odd but it is a handy way of recording the version of an application interface, if you have a number of applications, each with their own interface based on views, functions and procedures, having separate interfaces. At any one time, your interfaces may be at different versions, depending on the needs of each application and their development releases. A sensible way of handling interfaces is to put each in their own schema. This makes it far easier to engineer the security of each  one. Because of this, the schema becomes the obvious place to lodge the versioning interface of each interface. 

Let’s just show this by creating a schema

And adding an Extended property called ‘version’ with the value ‘2.5.7’

Then delete it,

… recreate it,

… and update it

Version numbers for tables or any other object you want

Having got this far, you might  want to use a similar device in order to attach version numbers to Tables, since the build scripts to tables are not held in the database and all table, columns and constraint comments are therefore lost if you auto-generate your build scripts.

Lets create a sample table

Then we attach a version number to it

delete it just to show we can

add another one in

and change it

Storing more complex information about versions

I like to store a date for when a version change happened and who did it: and why they did it. I know that a lot of this information is in Source Control, but sometimes it is hard to find.  It makes sense to copy this information from source control  into the database object on deployment. It is convenient to have this as an extended property because it is easy to add to the build script, even when it is automatically generated, and so you can see who did what, and when, whilst perusing the scripts.  It is no use having all this stored in version control only when it is 3AM, and I’m struggling with a problem with a sick production server where a database deployment has Gang aft agley.

The answer is to put it in a structured document. You can, of course use XML if you like, but you’re much likely to read raw JSON or YAML without error.  I tend to choose JSON because it is possible to interpret with TSQL whereas YAML requires an external application.

By using PowerShell v3, it is easy now to convert to and from JSON.

Which gives

..so you can read several variables out of one extended property, or even have hierarchical data. In your case, you might actually want to add a version to a list of versions the database has been upgraded to.

By converting this from  JSON  in PowerShell, adding the revision, and then converting back to JSON, then the revision is added.

… which gives …

Beware that you are limited to a string length of 7500 so you should not go mad with structured documents.  If you are repeatedly building a list, you’d do well to remove the most aged items in the list if the resulting JSON is over the limit.  It is also a help that there is a -Compress option that takes out the  white space and indented formatting in the output string, in order to save space.

This is all well and good, but you are making things more difficult to access the information from SQL. It is rare for a SQL routine to need to know the current version but it happens.

If you are sensible enough to use my  parseJSON utility procedure, then it is simple to do. Once you know the format of your JSON version record and put it in place, you just execute this to get the information as a simple table. Here it is as a view, with the JSON Parsing routine already installed in master. (Watch that in a production system!)

if you now try it out on our sample database…

…you’ll get ….

1731-clip_image001.png

Now you can get the current version number easily.

Conclusions

 The more complex a particular database deployment task gets, the more important it becomes to attach version numbers to databases, schemas  and other database objects.  At the same time, the developer must never be required to enter this information in more than one place.  This means understanding how the development process should be working and automating work such as this by scripting it.  There will be a number of places where a read-only copy of this information is kept, but only one updatable source, preferably in your source-control system.

Whatever system you are using to develop databases, it makes deployment and general admin  smoother  if you  know what versions an installation is using. By using extended properties, you can attach this sort of information firmly to the object so that it can be interrogated by scripts, viewed in the source code, inspected via TSQL or checked within the running application.