Choosing State Over Migrations

  • Comments posted to this topic are about the item Choosing State Over Migrations

  • My problem with state-based solutions is that they don't handle some scenarios. If I need to rename a table or column, that usually results in a drop and create, and you lose all your data (unless the tool is configured to fail on data loss, in which case the process just fails instead). This is fairly common in our dev and test environments, but admittedly it gets less frequent as we get closer to production.
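
    For contrast, a minimal migration-style sketch (table and column names are hypothetical): a hand-written rename keeps the data in place, where a naive state comparison would typically script a DROP and CREATE instead.

        -- Hypothetical names; a migration can rename in place and preserve the data
        EXEC sp_rename 'dbo.Customer.Surname', 'LastName', 'COLUMN';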

    A more troubling one is if you want to add a foreign key column to a table. Without a tool, you (1) add the column (allowing NULLs), (2) populate the column, and (3) alter the column to NOT NULL and add the FK constraint. What I usually tell the developers is that they need a pre-deploy script to do 1 and 2, and then we let the deployment tool take care of 3. The pre-deployment script needs to be idempotent, and we need a cleanup policy to remove the script after it's been deployed to all environments; a sketch follows below. Easy on paper, but in an environment with hundreds of databases and 50+ developers, not easy to wrangle.
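
    A minimal sketch of that pre-deploy script, with hypothetical OrderLine and Product tables; the re-run guards are what make it idempotent.

        -- (1) Add the column, allowing NULLs; safe to re-run
        IF COL_LENGTH('dbo.OrderLine', 'ProductID') IS NULL
            ALTER TABLE dbo.OrderLine ADD ProductID int NULL;
        GO

        -- (2) Populate the column; the WHERE clause makes re-runs harmless
        UPDATE ol
        SET    ol.ProductID = p.ProductID
        FROM   dbo.OrderLine AS ol
        JOIN   dbo.Product AS p ON p.SKU = ol.SKU
        WHERE  ol.ProductID IS NULL;
        GO

        -- (3) Making the column NOT NULL and adding the FK constraint
        --     are left to the deployment tool.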

    Factor in non-linear deployments (changesets A, B, and C are in UAT; B needs to go to prod tonight, but A and C have bugs and are going back to development) and you no longer have a consistent artifact you can promote. You have to create something new and different that will run for the first time in production.

    I always tell my developers that database deployments are hard because we can't just overwrite the server with all the current code the way application deployments do. DBAs have to consider the state of the data while the deploy is happening.

  • Personally, I think the article is a bit biased against the state-based approach, since it assumes you would be applying changes to production blindly, as opposed to comparing the state of production against the pre-prod environment that the change script was developed against. Assuming that the authors of both branches are always present during a merge-conflict resolution involving migrations isn't very realistic either.

    Main reasons I go with a state-based approach:

    • Schema drift is inherently managed and doesn't propagate beyond deployments (if you notice and ignore it, I wouldn't call it drift anymore)
    • Change scripts are mostly generated and the data-preservation patterns are constant for specific types of schema edits
    • Rollback scripts are relatively straightforward to write
    • Dropping a feature is as simple as dropping a branch, since you can delay merging feature branches far longer without generating extra work.
    • You don't end up writing migrations for changes that are overwritten with more changes before they get deployed.
    • Using the final schema as the source of truth is far more practical, since that's what's going to be running in production. It doesn't matter how clean and clever the migrations are if the database doesn't end up in the intended state in production.
  • This article shows me two techniques I don't know much about. I've got some questions about this, to help me learn more about it.

    Whether you use the state-based approach or migrations, how do you perform either action? Do you just get into SSMS (or Azure Data Studio) and run the scripts? Steve and others here have talked about using DevOps for databases, and I don't know how that's done. I'm familiar with TFS and am gaining experience with Azure DevOps Services and Azure Pipelines. Next on my list of things to learn is GitHub Actions.

    So, using Azure Pipelines as an example, I don't see how I could create build and release pipelines that would do something as simple as add a column to a table. How is this done?

    Kindest Regards, Rod. Connect with me on LinkedIn.

  • joe.lebaron wrote:

    My problem with state-based solutions is that they don't handle some scenarios.

    This is the main reason why people go with migrations. However, lots of customers don't run into the scenarios that aren't handled, or at least not very often. If I rename/split/merge things once a quarter, state is great; I can handle those exceptions.

    My vote here for state is: don't let perfect be the enemy of good enough.

  • dsor wrote:

    Personally, I think the article is a bit biased against the state-based approach, since it assumes you would be applying changes to production blindly, as opposed to comparing the state of production against the pre-prod environment that the change script was developed against. Assuming that the authors of both branches are always present during a merge-conflict resolution involving migrations isn't very realistic either.

    Main reasons I go with a state-based approach:

    ...


    The article is definitely biased. There are lots of easy solutions to get around the complaints the author lists. Depending on how often you do the items you listed, or how much they impact your work, you might think about migrations. Schema drift certainly can be an issue, as a change made directly in prod might end up as a drop or other change in the generated script.

    State isn't better or worse, but has different considerations.

  • Rod at work wrote:

    This article shows me two techniques I don't know much about. I've got some questions about this, to help me learn more about it.

    Whether you use the state-based approach or migrations, how do you perform either action? Do you just get into SSMS (or Azure Data Studio) and run the scripts? Steve and others here have talked about using DevOps for databases, and I don't know how that's done. I'm familiar with TFS and am gaining experience with Azure DevOps Services and Azure Pipelines. Next on my list of things to learn is GitHub Actions.

    So, using Azure Pipelines as an example, I don't see how I could create build and release pipelines that would do something as simple as add a column to a table. How is this done?

    Rod, you either develop in state or in migrations; the "how you perform an action" part isn't really relevant.

    If you make changes to dev and then later go back to figure out what the differences are with production, you're working in state. If you write and save a script for each change you make, then you're working in migrations. It's really a question of timing, because all changes end up in a migration script of some sort: either one big one, like from SQL Compare, or lots of little ones you've saved.

    The "how you make the changes" could be any of these, regardless of how you get to your change script(s).

    • Email script(s) to someone to run in SSMS
    • Write a batch file that executes script(s) with SQLCMD
    • Use a task in Azure DevOps Pipelines to run your script(s)
    • Use Flyway to execute all scripts in a folder against an instance

    Any of these works. They are implementation details, separate from the philosophy of state vs. migrations.

    If you make changes in prod with SSMS, all bets are off. Keep your resume up to date.

  • Rod at work wrote:

    I've got some questions about this, to help me learn more about it.

    If you are asking about the tooling and how to get started, take a look at Redgate SQL Compare (I am not affiliated with Redgate, but it is downloadable for a free two-week trial). It will compare two databases and show you the differences. Then it can generate a script that makes all the changes needed for the target to match the source schema. That's the essence of state-based: you declare one version of the database to be "true" and make the other(s) look like that; the sketch below shows the flavor of script it generates. From there, Redgate has a bunch of tools to help automate (DevOps) that process.
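
    A hedged sketch of the kind of sync script a compare tool produces (object names are hypothetical; real tools also wrap the changes in transactions and error handling):

        -- Bring the target up to the declared "true" schema
        ALTER TABLE dbo.Customer ADD Region nvarchar(50) NULL;
        GO
        ALTER TABLE dbo.Customer ADD CONSTRAINT DF_Customer_Region
            DEFAULT (N'Unknown') FOR Region;
        GO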

    Another option is to look up Microsoft's SQL Database Projects, which are managed through Visual Studio. Visual Studio is free (up to a point, for non-commercial use), and the tooling to automate it is free (or included with SQL Server).

    For a migration-based approach, what I have done is have everyone commit their change scripts into a source control system. I then apply those changes to the target database in the order they were committed, and I track in that database which scripts have been run, so I know (1) not to re-run anything and (2) where to start the next time I run; a minimal sketch of that tracking pattern follows.
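
    Here is that tracking pattern sketched out, with a hypothetical MigrationHistory table and script name; tools like Flyway and Liquibase maintain a similar history table for you.

        -- Create the history table once per target database
        IF OBJECT_ID('dbo.MigrationHistory') IS NULL
            CREATE TABLE dbo.MigrationHistory
            (
                ScriptName nvarchar(260) NOT NULL PRIMARY KEY,
                AppliedAt  datetime2     NOT NULL DEFAULT SYSUTCDATETIME()
            );
        GO

        -- Guard each committed script so it runs exactly once
        IF NOT EXISTS (SELECT 1 FROM dbo.MigrationHistory
                       WHERE ScriptName = N'0042_add_customer_region.sql')
        BEGIN
            ALTER TABLE dbo.Customer ADD Region nvarchar(50) NULL; -- the change itself
            INSERT dbo.MigrationHistory (ScriptName)
            VALUES (N'0042_add_customer_region.sql');
        END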


  • Thank you very much, Steve, for the explanation. I was trying to make it too complicated, although I can see it would be possible to use an Azure Pipelines task to do it. I think it likely that someone would develop the script, then either run it themselves or give it to someone else to run in SSMS.

    Kindest Regards, Rod. Connect with me on LinkedIn.

  • My pleasure. Azure DevOps, Bamboo (Atlassian), and Jenkins are just step engines. Like SQL Agent, they will run things.

    The nice thing about using these is that they can abstract away permissions, connections, etc., so that you are sure of your process: no accidentally connecting to the wrong database, and no giving someone too many permissions. The setup can be as simple as dropping queries in a folder and using Posh or SQLCMD to run them.

    If you were looking for a way to make this simple and cheap over time, I'd recommend Flyway (Redgate) or Liquibase. Both have free, OSS editions you can use.

    Disclosure: I work for Redgate, which owns Flyway.
