On Heroku, is there danger in a Django syncdb / South migrate after the instance has already restarted with changed model code?

On Heroku, as soon as you push new code, the web-serving instances restart... even if the underlying database schema additions/changes (via syncdb or south migrate) haven't yet been applied.

In many cases, this might just cause harmless errors until the syncdb/migrate is run soon afterward. But I'm concerned that in some cases, new code might half-work, making unexpected changes in the pre-migration database.

What's the right way to be safe against this risk?

One technique might be to add the syncdb/migrate to the Procfile so it's run before web restart. But, in the case of multiple instances, or maybe even a case where the one old-code-instance is left running until the moment the one new-code-instance is known-up, there's still a variant of the issue where code is talking to a DB with a mismatched schema.

Is there a 'hold all web instances' feature (or common best practice) for letting the migrate complete without web traffic?

Or am I being overly concerned about a risk that is negligible in practice?

Solution

The safest way to handle migrations of this nature, Heroku or no, is to strictly adopt a compatibility approach with your schema and code:

- Every additive or transformative schema change must be backwards-compatible.
- Every destructive schema change must be performed only after the code that depends on it has been removed.
- Every code change must be either:
  - durable against the possibility that associated schema changes have not yet been made (for instance, removing a model or a field on a model), or
  - made only after the associated schema change has been performed (for instance, adding a model or a field on a model).
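
As a rough illustration of the first rule, here is a minimal sketch using Python's built-in `sqlite3` module rather than Django and South. The `person` table, the `nickname` column, and the helper functions are all hypothetical. It assumes the old code names its columns explicitly and that the new column is nullable, so rows written by code that predates the migration remain valid afterward.

```python
import sqlite3


def old_code_insert(conn, name):
    # Old code only knows about "name" and lists its columns explicitly,
    # so it is unaffected by columns added later.
    conn.execute("INSERT INTO person (name) VALUES (?)", (name,))


def new_code_insert(conn, name, nickname):
    # New code depends on the added column, so it must be deployed
    # only after the additive migration has run.
    conn.execute(
        "INSERT INTO person (name, nickname) VALUES (?, ?)",
        (name, nickname),
    )


def demo_additive_change():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    old_code_insert(conn, "alice")

    # Additive, backwards-compatible migration: the new column is nullable,
    # so inserts that omit it still succeed.
    conn.execute("ALTER TABLE person ADD COLUMN nickname TEXT")

    old_code_insert(conn, "bob")            # old code still running
    new_code_insert(conn, "carol", "caz")   # new code, deployed afterward

    rows = conn.execute(
        "SELECT name, nickname FROM person ORDER BY id"
    ).fetchall()
    conn.close()
    return rows
```

Had the column been added as `NOT NULL` without a default, the second `old_code_insert` call would fail, which is the kind of mismatch the rules above are meant to rule out.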

If you need to make a significant transformation of a model, this approach might require the following steps:

1. Create a new database table to hold your new model structure, and deploy that migration
2. Create a new model with the new structure, and code to copy changes from the old model to the new model when the old model changes, and deploy that code
3. Execute a migration or code action to copy all old model data to the new model
4. Update your codebase to use the new model rather than the old model, deleting the old model, and deploy that code
5. Execute a migration to delete the old model structure from the database
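
The five steps above can be sketched end to end. This is only an illustration with Python's built-in `sqlite3` module, not Django or South code. The `contact` and `contact_v2` tables, the name-splitting transformation, and every function name are hypothetical. In a real project each step would be a separate deploy or migration rather than one function call after another.

```python
import sqlite3


def split_name(full_name):
    # Hypothetical transformation from the old structure to the new one.
    first, _, last = full_name.partition(" ")
    return first, last


def step1_create_new_table(conn):
    # Step 1: additive migration; old code never touches this table.
    conn.execute(
        "CREATE TABLE contact_v2 "
        "(id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
    )


def step2_save_contact(conn, contact_id, full_name):
    # Step 2: code still writes the old table, and mirrors each write
    # into the new table.
    conn.execute(
        "INSERT OR REPLACE INTO contact (id, full_name) VALUES (?, ?)",
        (contact_id, full_name),
    )
    first, last = split_name(full_name)
    conn.execute(
        "INSERT OR REPLACE INTO contact_v2 (id, first_name, last_name) "
        "VALUES (?, ?, ?)",
        (contact_id, first, last),
    )


def step3_backfill(conn):
    # Step 3: copy rows that existed before dual writes began.
    rows = conn.execute(
        "SELECT id, full_name FROM contact "
        "WHERE id NOT IN (SELECT id FROM contact_v2)"
    ).fetchall()
    for contact_id, full_name in rows:
        first, last = split_name(full_name)
        conn.execute(
            "INSERT INTO contact_v2 (id, first_name, last_name) "
            "VALUES (?, ?, ?)",
            (contact_id, first, last),
        )


def step5_drop_old_table(conn):
    # Step 5: destructive migration, run only once no deployed code
    # reads or writes the old table (step 4).
    conn.execute("DROP TABLE contact")


def run_workflow():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contact (id INTEGER PRIMARY KEY, full_name TEXT)"
    )
    conn.execute(
        "INSERT INTO contact (id, full_name) VALUES (1, 'Ada Lovelace')"
    )

    step1_create_new_table(conn)
    step2_save_contact(conn, 2, "Alan Turing")
    step3_backfill(conn)
    # Step 4 would be a deploy in which code uses only contact_v2.
    step5_drop_old_table(conn)

    rows = conn.execute(
        "SELECT id, first_name, last_name FROM contact_v2 ORDER BY id"
    ).fetchall()
    tables = [
        r[0]
        for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    conn.close()
    return rows, tables
```

At every point in this sequence, whichever code is deployed finds the tables it expects, regardless of whether the next step has run yet.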

With some thought and planning, the same compatibility approach can be used for more drastic changes as well:

1. Deploy code that completely removes dependence on a section of the database, presumably replacing those sections of the site with maintenance pages
2. Deploy a migration that makes drastic changes that would not for whatever reason work with the above dual-model workflow
3. Deploy code that brings the affected sections back with the new model structure supported
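
One possible way to replace a section of the site with a maintenance page is a small WSGI wrapper, sketched below with the standard library only. The `MAINTENANCE_MODE` environment variable, the `blocked_prefixes` argument, and the function name are assumptions made for illustration. Toggling by environment variable is a substitute for the separate code deploys described in the steps above, not the same thing.

```python
import os


def maintenance_middleware(app, blocked_prefixes, flag_env="MAINTENANCE_MODE"):
    """Wrap a WSGI app so selected URL prefixes return 503 while a flag is set.

    The flag name and the prefix list are hypothetical; requests outside the
    blocked prefixes are passed through to the wrapped app untouched.
    """
    prefixes = tuple(blocked_prefixes)

    def wrapped(environ, start_response):
        path = environ.get("PATH_INFO", "")
        if os.environ.get(flag_env) == "1" and path.startswith(prefixes):
            body = b"This section is temporarily down for maintenance."
            start_response(
                "503 Service Unavailable",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                    ("Retry-After", "300"),
                ],
            )
            return [body]
        # Flag unset or path unaffected: serve the request as usual.
        return app(environ, start_response)

    return wrapped
```

Because the blocked views never run, no code touches the affected tables while the drastic migration is in progress, and the rest of the site stays up.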

This can be hard to organize and requires strict discipline and a firm understanding of your code's interaction with your database, but in practice, it does allow for most changes to be made with no more downtime than the server restart itself imposes.