Consolidating Doctrine Migrations

by pschwisow

Doctrine Migrations are a great way manage changes in your database structure, but over time migration files can pile up and turn into an unmaintainable mess. Migration usage is well-documented, but developers are left to their own devices when it comes to long-term maintenance. This article outlines a process to clean-up the mess and start fresh with a new base migration with minimal impact on existing database instances.

Note: This article was originally published in the March 2015 issue of php[architect].

What are Doctrine Migrations?

The purpose of Doctrine (or any ORM) is to separate the business / domain logic from how it is persisted in the database. When requirements change, both the entities and the underlying database schema must be updated. Synchronizing code changes with schema changes manually is tedious and error-prone. Doctrine Migrations automate the process. Migration classes can be created manually or automatically generated based on the differences between entity definitions and the current state of the database. The migration system tracks which migrations have already been run, so the database can easily been brought from any past version to the current schema.

Notes to Readers:

  • I will attempt to go into enough detail that readers without experience using Doctrine Migrations can follow the process, but I will not attempt to go into details about setting up Doctrine DBAL, ORM, or Migrations in this article. Before attempting the steps outlined below, you should by familiar with Doctrine Migrations and know how to it is configured in your application. See the excellent documentation at http://www.doctrine-project.org/projects/migrations.html for more information.
  • All examples which show Doctrine Entities will use docblock annotations for entity metadata. Doctrine may also be configured to read metadata from YAML or XML, so please adjust as necessary.

Migration Commands

Doctrine provides a command-line tool for completing various tasks related the the DBAL, ORM, and Migrations packages. Depending on how Doctrine was installed and configured, there may be some differences in how you execute the commands. Doctrine’s documentation assume that you run commands as ./doctrine {command_name}, and I will be using this convention as well. If you installed DoctrineMigrationsBundle in a Symfony application, replace ./doctrine with php app/console and prepend doctrine: to your command names. For instance ./doctrine migrations:status becomes php app/console doctrine:migrations:status. You can see the entire list of commands by running ./doctrine list.

The Migrations commands are:

  • migrations:execute – Run a single migration or undo it with --down option
  • migrations:generate – Create an empty migration class
  • migrations:latest – Print the version number of the latest migration class
  • migrations:migrate – Execute migrations up or down to get to the specified version number (or the latest version if the version number is omitted)
  • migrations:status – Print detailed information about the current migration status
  • migrations:version – Mark specified migration version as complete or not complete without actually executing any of its queries

Basic Migration Example

Let’s say you want to add a new field to your existing Foo entity. First add the field to the entity.

/**
 * @ORM\Column(type="string", length=100)
 */
 private $bar;

Then generate a migration file based on the differences between the entities and the current database schema.

$ ./doctrine migrations:diff
Generated new migration class to "/some/path/DoctrineMigrations/Version20141103080145.php" from schema differences.

Migrations are automatically numbered based on the date and time that they are generated, so /some/path/DoctrineMigrations/Version20141103080145.php contains a migration generated on 2014-11-03 (November 3, 2014) at 08:01:45 (forty-five seconds after 8:01 a.m.). If several migrations are run at the same time, they will always run in chronological order. If several developers create migrations on their branches and merge later, there will be no conflicts with filenames (unless the happen to generate their migrations at the exact same second), but it’s still possible that their SQL statements will not work together. The migration file contains a PHP class with SQL statements to migrate up to the new schema structure or migrate down to return to state prior to this change.

namespace DoctrineMigrations;

use Doctrine\DBAL\Migrations\AbstractMigration,
    Doctrine\DBAL\Schema\Schema;

class Version20141103080145 extends AbstractMigration
{
    public function up(Schema $schema)
    {
        $this->addSql('ALTER TABLE foo ADD bar VARCHAR(255) NOT NULL');
    }

    public function down(Schema $schema)
    {
        $this->addSql('ALTER TABLE foo DROP bar');
    }
}

Now we can check the current migration status and execute any migrations that have not yet been run.

$ ./doctrine migrations:status

 == Configuration

  >> Name:                 Doctrine Database Migrations
  >> Database Driver:      pdo_mysql
  >> Database Name:        testdb
  >> Configuration Source: /some/path/migrations.xml
  >> Version Table Name:   doctrine_migration_versions
  >> Migrations Namespace: DoctrineMigrations
  >> Migrations Directory: /some/path/DoctrineMigrations
  >> Current Version:      2014-10-31 13:04:22 (20141031130422)
  >> Latest Version:       2014-11-03 08:01:45 (20141103080145)
  >> Executed Migrations:  10
  >> Available Migrations: 11
  >> New Migrations:       1

$ ./doctrine migrations:migrate
Are you sure you wish to continue? y
Migrating up to 20141103080145 from 20141031130422

  >> migrating 20141103080145

    -> ALTER TABLE foo ADD bar VARCHAR(255) NOT NULL

  >> migrated

Too Much of a Good Thing

The first migration file for your project should take you from an empty database to the schema that corresponds to your entities in an early version of your application. Over time you will accumulate additional migrations that add, update, and delete tables and fields. With a bit of refactoring (and an occasional wrong turn that later gets undone), you may find that later migrations are “undoing” earlier migrations. There is also overhead in the process of executiing migrations, so one large migration will complete much more completely than many small migrations. Cruft builds up until you reach a point where it becomes more efficient to start again with a blank slate and create a new “base migration”.

Creating the New Base Migration

Before you start creating migrations, check out the latest version of your application and confirm that your development database matches the latest migrations. (Refer back to Listing 2 to see the steps required.) To create a new base, you’ll need an empty database to work with. Using your MySQL client of choice, run CREATE DATABASE migration_temp; and update your Doctrine configuration to point at the new database. Now generate a migration that’s a diff between your empty database and the current entities.

$ ./doctrine migrations:diff
Generated new migration class to "/some/path/DoctrineMigrations/Version20141103161501.php" from schema differences.

Because migrations may contain additional queries and other actions that were manually added, you should look through all the old migrations. Look for queries that contain INSERTs, UPDATEs, or DELETEs. Find any preUp, preDown, postUp, and postDown methods. These will need to be manually added to your new base migration, but use your judgment to determine if any are contradictory or no longer needed. You can inspect the SQL statements that will be run by the migration without actually executing them by using the --dry-run option.

$ ./doctrine migrations:migrate --dry-run
Are you sure you wish to continue? y
Executing dry run of migration up to 20141103161501 from 0

  >> migrating 20141103161501

    -> CREATE TABLE user (username VARCHAR(255) NOT NULL,
<snip>

Cleaning up

At this stage, your new base migration exists as the newest of many migration files. Previous migrations are no longer needed and will only cause problems if left in place. (Running all migrations at this point would attempt to create tables on top of already existing tables.) The simple solution would be to delete all the previous migration files; however, I would suggest archive them in case you need to reference them in the future. (Even if you do delete them, your version control system should allow you to track them down later, but digging through history to find a file that was deleted some time ago is a frustrating exercise at best.) Create an “Archive” directory within your migrations directory and move all but your new base migration into this folder. (The actual directory name and location are not important, but I prefer to keep the archives in a sub-directory of the migrations directory.) If your version control system maintains history when files are moved, you can still easily research the changes that relate to the original migrations. If you use git, the series of commands below will create your archive folder and move all but one migration into it (adjust the path and filename as necessary).

cd /some/path/DoctrineMigrations/
mkdir ./Archive
find . -maxdepth 1 -mindepth 1 -type f -not -name Version20141103161501.php \
-exec git mv '{}' ./Archive \;

Now we can verify that the new base migration works correctly. Run the migration and then confirm that the database schema in your temporary database matches the schema of your original development database. You can do this either by visual inspection or by dumping the table structures of both databases and using a diff tool to compare them.

Tying up Loose Ends

We’ve now confirmed that the new base migration will work correctly when starting from an empty database, but what happens when you deploy your changes on an existing database? Change your application configuration to point once again at your existing test database and check the migration status:

$ ./doctrine migrations:status
<snip>
  >> Executed Migrations:              11
  >> Executed Unavailable Migrations:  11
  >> Available Migrations:             1
  >> New Migrations:                   1

If you run this command on a terminal that supports color, the 11 “Executed Unavailable Migrations” will be prominently formatted as an error. Migrations are tracked in the doctrine_migration_versions table, and this table now lists migration files that no longer exist in DoctrineMigrations directory. To handle this problem, we have to be a bit clever and leverage what we know about how migrations are tracked to create two branches of logic. If the migration is run on a fresh database, create all the tables (move all of your existing queries into this branch of the logic). If the migration is run on an existing database, clean up the migration versions table and mark this migration as run without executing any of the table creation statements.

    public function up(Schema $schema)
    {
        if (count($this->version->getConfiguration()
            ->getMigratedVersions())) {
            // Structure already exists, clean up archived versions
            $this->addSql("TRUNCATE doctrine_migration_versions");
        } else {
            // Empty database, create tables
            // -- PUT ALL OF THE addSql CALLS HERE --
        }
    }

Once this is complete, do thorough testing of both up and down migrations on both empty databases and databases starting at the prior version. Make sure that all these cases work properly. You should notice in your testing that the down migration will always result in dropping all tables. Migrating from the prior version to the new base migration wipes out prior migration history, so there is no way to know whether the prior state was an empty database or not. We can’t really back out of the migration, so test carefully in a “safe” environment before deploying any of your changes.

Caveats for Consolidating Doctrine Migrations:

  • Be careful with code merges around the time of your changes. (Remember that migrations are run in chronological order.) If a migration with an earlier date / time than your base migration is added, then it would be run before the base migration when deploying on an empty database. There are too many potential scenarios to go into detailed solutions here.
  • Running the new base migration on databases that were not up-to-date prior to this will cause a major failure. Our logic assumes that if any migrations have already been run, the database schema is up to date. If you practice continuous deployment, this problem is unlikely, but you should be cautious of the way you deploy this change.

Regular Maintenance

The consolidation process outlined above can be repeated as often as needed. The frequency is determined by the quantity and nature of changes to your database schema. On a moderately active project, yearly consolidation might be appropriate. After completing a major change in schema, it might be worth considering if consolidation should be done in the near future.

Advertisements