Change Managment in database development?

  • This whitepaper presents what we believe are the three most common ways of applying change management techniques to database development:

    ·         Ad-hoc merging – periodic merges of modified database developer instances

    ·         Object level source control – controlling and tracking changes to your database design using a source control system

    ·         Offline development – unlike the previous two, this method does not use a 'live' database. Schemas are developed by directly editing the SQL in the creation scripts,

    It discusses the advantages and disadvantages of each approach and describes how we believe Red Gate's SQL Compare fits in to the model you choose to adopt, and improves it.

    Have we got it right? Are there alternative techniques that we've unjustly ignored?

    We'd really like to hear any feedback you've got to give!

    Cheers,

    Tony Davis (Simple Talk Editorial Director, Red Gate Software)

  • I think that about covers it for processes that are likely to deliver any real benefits.

    Of course there are always many other "processes" that people employ such as in one of the other posts whereby the production database is detached, zipped and emailed to the developers for them to apply changes however they want and is then emailed back to the production DBA to re-attach.  Laughable, of course but it shows that people can think they have a "process" when, instead they have a disaster waiting to happen.

    The first scenario, as you quite rightly point out, suffers from a lack of scalability if the database changes are complex and/or there are more than a few people making changes as there is an exponential increase in effort as changes/people increases linearly.  This is the problem that all pure "diff tools" suffer from.

    Only the last two scenarios offer a solid approach to change management - one thing I didn't see (maybe it's there and I missed it) was that there should be a automated, continuous database build implemented from the source control system with a direct feedback loop (via email) to the development team.  That way, when changes are checked in developers get notified almost immediately that there are problems and so can fix them very quickly instead of issues coming out when the next daily or scheduled build occurs when memories have faded somewhat.

    The last two approaches are what DB Ghost (http://www.dbghost.com) has been doing since 2003 and, in fact, our whitepaper on the challenges of database change management has been out there for about 3 years so I'm glad that Microsoft (with VSTS4DB) and Red Gate have finally validated what we have been stating all this time - that solid change management can only be achieved by using that most valuable piece of software - the source control system.

    Welcome to the party guys, I'm just surprised you didn't do it sooner

    Malcolm
    DB Ghost - Build, compare and synchronize from source control = Database Change Management for SQL Server
    www.dbghost.com

  • If all we're talking about is keeping schema in sync, it's simple enough -

    I run sp__revtable on the tables on my Sybase server databases that have not yet been full ported to SQL Server. If I care only about schema, I can simply drop and remake

    the tables and indices with the output.

    As always "it depends". On an established system being rebuilt elsewhere, it's far more important that the new one be like the real thing, as opposed to what the source code thinks it should be. Of course, the source code should reflect this state, so if it doesn't due to neglect over the years, I can use this same method to generate a source list for tables and SCCS those to keep a history.

    Far more critical and as yet unaddressed is doing the same for the data. HitSW DBMoto can do it if you don't mind pushing all the data every time - but that stinks for large dbs that usually change less than 1% of their data per day.

    Roger L Reid

  • Interesting. I've mostly used model 1 as referenced, and for small/medium operations it works just fine. I see a lot of value in a diff tool for change detection so that developers aren't asked to keep track of changes manually, and of course to avoid missing any changes, even if inadvertent.

    Not sure if I agree that source control/versioning should be integrated with these tools. My dream would be integration at the SSMS/db level so that any change applied - anywhere - would be immediately added to source control, and that I could right click an object and roll back to any previous state. Somewhere in there need the ability to group changes that have to be made/rolled back together. Ultimately doing the detection anywhere except DDL triggers leaves the possibility that someone, however well meaning, will bypass the version control phase.

    I haven't tried the latest SQL Compare and Malcolm, I have to admit I haven't tried your product at all, but I'll get around to it, so maybe I'm not seeing the big picture! I do look forward to reading your whitepaper as well.

  • Thank you all for your comments. We believe that allowing SQL Compare 6 Pro to support a source-controllable representation of a database schema will satisfy many database developers. We also recognise that a great solution would be to link up SQL Server itself with those source controlled files (as Andy has suggested). We are actively researching the best way to provide this sort of integration.

    One of the greatest challenges for us is to author tools that are simple and generic enough to provide as much flexibility to the database developer as possible. Bear in mind that there are a vast number of different database development models used in the field. Although our white paper describes three such models, Red Gate tools are flexible enough to support many variations of these.

    We have created a short survey at https://www.surveymk.com/s.aspx?sm=qbb94wjddl19bBnzM580ww_3d_3d

    Please help us better understand your database development process by completing the survey. We are giving away five copies of SQL Refactor to randomly selected participants. We have two additional mystery prizes to give away to the best/most comprehensive responses.

    Thanks for your assistance,

    David Atkinson, Product Mananger, Red Gate Software

  • I think that these approaches were written with only redgate in mind and not a pure approach.

    In a truly secure environment, the only changes to the schema or procedures would only occur in a source control package.  Each table, procedure, trigger..etc needs to be in the source control.  All changes are tested against a staging environment, before checking back into source.  A script will build the changes into the system....IE a release. 

    Eric

  • When I think about versioning a database I've used database in various modes:

    Goal : Anytime a change is made to a database object, any supporting application code should be committed to the version control system in the same check-in. For example, if I am adding a new column to a database table and have updated a stored procedure and have modified a middle tier object that ends up calling the stored procedure, all 3 of these changes should be committed to the version control software in the same check-in.

    What complicates this is getting the distinct changes to each object (table/sproc script/source code). The stored procedure, source code are relatively easy to deal with but dealing with the table change is the harder problem. It's relatively easy to generate a new table script, but getting other team members to use this new table will require them to drop/create the new table structure. If they have data in the old table structure they have to deal with loosing this data, or scripting it out to preserve their data.

    This problem is exactly what breaks the sandbox approach. We wouldn't tolerate this if we had to destroy our source code whenever we got latest when anther developer made changes to files we were working on.

    I can think of a few scenarios in how we manage databases:

    1. Sandbox mode : developers have local copies of the database (their sandbox) and are able to make changes to their database at will to support new features that they are working on. To support this in the best manner we would need to have modification scripts that are stored in the version control system and are applied to our database. This is essence the database diff that needs to be applied. Now for new developers coming into the project they need a way to be able to get their environment up and running, but they shouldn't have to run hundreds of patch files to get their database environment setup. There should be a full schema that represents the current 'official' version of the database schema.

    2. Shared mode : in this mode developers are using the same database running on a server. Any change that one person makes could impact another user. In one environment I worked in, we would make changes to our own objects (user owned) so that we could write/test code in isolation. Once the changes were ready we would make changes to the 'dbo' tables. This might have a cascading effect where other code would need to be changed to deal with the updated table/schema.

    3. Promotion mode : in this scenario it could be either the shared mode or sandbox mode but what we have to do is to move changes from a development environment into a QA/Test/Production database. The current Red-Gate tools could be used to accomplish this sync but if we already have diff scripts in the version control could we not use those changes to actually perform our prop.

    Another useful feature of checking in schema/procedures into version control through labeling/tagging, we should be able to get the source code/schema/procedures that all match up. So that one could run the schema/procedure scripts to create the database and the code would work against that version of the schema.

    What I've done to assist me in some of my projects is to establish some procedures as follows:

    1. There is a table that contains the version of the database schema. I tend to call this table SchemaVersion and it stores the Major,Minor,Build version of the schema.

    2. There is a Schema.sql file that contains the 'official' version of the schema. New developers on the project would run this file to get them started.

    3. For each database change a script is generated that is the 'diff' of the modification to the database schema. This 'diff' change would use the information stored in SchemaVersion table to know whether this modification needed to run in the database.

    I love RedGate's SQL Compare, but the changes that it generates I needed to wrap in a transaction that checks if the SchemaVersion table had been updated and to insert the schema version record into this table (or update the row if only tracking the current version).

    I wrote a tool that uses SQL Compare API to generate the differences but I take these differences and wrap them in my own TSQL that manages the application of the script by checking the SchemaVersion table. I made the generated SQL re-runnable so that if someone accidentally executed the script again, it would modify the structure unnecessarily.

    I've been meaning to blog about the tool that I wrote and share the code but I just haven't gotten around to it. I'd like to share with you the approach I used to see if it might fit into your plans.

  • Terry,

    How do you reconcile the full schema "official version" from what is represented by the developer's change scripts?  This is a necessary step in any approach that tries to use two versions of the truth as the two will always diverge over time.

    What normally happens (in my experience of project that use the techniques you describe) is that the "official version" simply gets out of date and so the developers end up copying each others sandbox databases, because *they work*, and the official version *doesn't* so the official version ends up being shelved or marginalized.

    Basically, in any process, you should only ever have one version of the truth - that's what we have for every other type of source code and that is what we should have for SQL code as well. 

    Any process that relies on multiple diff scripts is doomed to be resource intensive and error prone.  Why would anyone risk hand rolling an inefficient process against their company's most important asset (the database) when there are inexpensive tools on the market (such as DB Ghost and now SQL Compare 6) that are proven and make everything easy and rock solid?

    Malcolm
    DB Ghost - Build, compare and synchronize from source control = Database Change Management for SQL Server
    www.dbghost.com

  • Malcolm,

    I realize you are trying to sell your product but I've been involved with SQL Server for over 13 years and I have never heard of your company.

    How I reconcile against the 'official' database is the full schema file. In my case I run the full schema file in a temporary database, run SQL Compare against the sandbox and the temp offical schema to generate the diff which I save to a file for checkin then generate an updated full schema file.

    Now the user checks in his code, the generated diff and the updated full schema.

    If you are working in this sandbox type environment since you don't have a shared resource that you can lock, it does require the devs to communicate by saying 'hey, im about to generate patch 15'. So if another dev has changes that they need to commit, they will have to wait until patch 15 is checked in, get the latest full schema and now they can generate their diff 16.

    And keep in mind this was before SQL Compare 6 and I haven't looked to see what is available since I wrote it a couple of years ago.

  • The requirement for devs to communicate with each other before checking in diffs scripts is the fundamental problem in the process and what makes it resource intensive and prone to error.  It also leads to the exponential problem I referred to in my earlier post - as the number of developers in the team increases the amount of communication required increases expontentially.  This means that either more time and resource needs to be allocated to the process or, as is more often the case, time and resources are not made available and shortcuts are taken to hit deadlines.

    Consider the scenario where you have a team of 10 developers working remotely or across several time zones and this becomes nearly impossible to do reliably. 

    The main problem with your approach is that it doesn't use the source control system to control the order of changes and prevent multiple updates to the same database object.  As the changes are hidden inside the myriad diff scripts that the developers create every check-in works fine and you don't encounter any problems until (if you're lucky) the scripts are run against the QA environment or (if you're unlucky) during the testing phase itself or (if you're extremely unlucky) in the live system.

    Finally, yes, my company sells a product (that can be found by typing "database change management" into any search engine ) but that doesn't invalidate what I am saying.  I was hoping that this topic might bring out some processes that were beyond what we vendors are hawking and so I was slightly disappointed to see a description of a process that I have seen many times before and has some fundamental problems in deadline driven projects.

    Malcolm
    DB Ghost - Build, compare and synchronize from source control = Database Change Management for SQL Server
    www.dbghost.com

  • But your solution only supports VSS, so that invalidates your arguments about using the SCM to manage the changes.

  • Not all of us - or even most of us, work in cross continent scenarios, but Im here to learn. The discussion seems valuable. Malcolm & others, you should approach Steve about writing some articles on this to take us from the chaos where we are to where we would like to end up!

  • But Malcomn was making an argument that the approach I was using was prone to error and was solved by his product by using the SCM to manage the locking of the resource. If his product doesn't use the SCM that I'm using (Vault, TFS, Subversion) then his resource locking mechanism won't work in my scenario and thus his arguments are invalidated for my environment.

  • If I could offer one suggestion to any vendor developing an approach for database change management is if you can't support different SCMs, then try to leverage MSSCCI so at least others that are not using VSS have a chance to use/try your product by using the same API that Visual Studio uses to communicate with the SCM.

  • Was just looking on the dbghost site and it says you can manully check the scripts into other SCM's (which I grant seems like more work) so at least not 100% limited to VSS. And Im not saying he's right, dont think I understand the problem well enough to vote for a solution. Which is why some bite sized articles would be nice for the site.

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply