Tags: docker, flask, kubernetes, skaffold

How to handle database migrations with Kubernetes and Skaffold


The issue:

Locally, I use Skaffold (Kubernetes) to hot reload both the client side and server side of my code. When I shut it down, it deletes my server pod, including my /migrations/ folder, so the pod gets out of sync with the alembic_version in my database. In production, I'm not deleting my server pod, but I am rebuilding the Docker image when I deploy, which results in my /migrations/ folder being replaced.

The question

How do I handle these migrations so my database doesn't get out of sync?


Application setup

The API is Flask/Python and uses Flask-Migrate. For those unfamiliar, it creates a migrations folder with version files like 5a7b1a44a69a_.py. Each file defines upgrade() and downgrade() functions that manipulate the db, along with revision and down_revision identifiers. The current revision is recorded in the alembic_version table in my postgres pod.

Kubernetes and Docker setup

I have a server pod and a postgres pod. I log in to the shell of the server pod to run the migrate commands. They create the version files inside the Docker container and update the db.

To show a step-by-step example of the problem:

  1. sh into the server-deployment pod and run db init.
  2. The migrations folder is created.
  3. Perform a migration on server-deployment, which creates a migration file and updates the db.
  4. The db in the postgres pod gets the schema update and an alembic_version entry.
  5. Run skaffold delete or ctrl-c skaffold.
  6. The server-deployment pod gets deleted, but postgres doesn't. The migrations folder goes away.
  7. Start skaffold back up, sh into the server-deployment pod, and try to run db migrate. It asks you to run db init again.
  8. If you try to downgrade from here, it doesn't do anything. Now the server pod and postgres pod are out of sync in terms of alembic_version.
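
To make the out-of-sync state in step 8 concrete, here is a minimal sketch of a check that could run before any migrate command. It only assumes that each version file contains a `revision = '...'` line, as in the default Alembic template. The helper names `local_revisions` and `is_in_sync` are hypothetical and not part of Flask-Migrate or Alembic.

```python
import re
from pathlib import Path

# Matches `revision = 'abc'` and the annotated form `revision: str = 'abc'`.
# Anchoring at line start keeps `down_revision = ...` from matching.
_REVISION_RE = re.compile(
    r"^revision\b[^=\n]*=\s*['\"]([0-9A-Za-z_]+)['\"]", re.MULTILINE
)


def local_revisions(migrations_dir):
    """Return the set of revision IDs found in <migrations_dir>/versions/*.py."""
    versions = Path(migrations_dir) / "versions"
    if not versions.is_dir():
        # This is the situation after the pod was recreated: no folder at all.
        return set()
    found = set()
    for path in versions.glob("*.py"):
        match = _REVISION_RE.search(path.read_text())
        if match:
            found.add(match.group(1))
    return found


def is_in_sync(migrations_dir, db_version):
    """True if the revision recorded in the database exists as a local file.

    db_version is the value stored in the alembic_version table, or None
    if the database has never been migrated.
    """
    if db_version is None:
        return True
    return db_version in local_revisions(migrations_dir)
```

In the scenario above, `db_version` would be whatever `SELECT version_num FROM alembic_version` returns from the postgres pod. After step 6 the folder is gone, so `is_in_sync("migrations", "5a7b1a44a69a")` would return `False`.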

Final Notes

What I used to do pre-Docker/Kubernetes was run it locally and commit the migrations version file to my repo. It was synced across all environments, so everyone's repo was on the same alembic_version. I've considered creating a separate, always-on "migrations-deployment" pod that is another instance of Flask so that it never loses the /migrations/ folder. However, that seems like a really poor solution.


Hoping for best practices or ideas!


Solution

  • I figured out a way to handle this. I'm not going to set it as the correct answer until someone else confirms if this is a good approach.

    Basically, what I did was create a Persistent Volume Claim. Inside the server-deployment I hook up the migrations/ folder to that Persistent Volume. That way, whenever the pod gets deleted, the migrations/ folder remains and gets persisted across pod restarts.

    It would look something like this inside the server deployment.

          containers:
          ..........
            volumeMounts:
              - name: migrationstuff
                mountPath: 'MyServerApplicationDirectory/migrations'
          volumes:
            - name: migrationstuff
              persistentVolumeClaim:
                claimName: migrate-pvc
    

    The PVC would look like this:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: migrate-pvc
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
    

    For those who decide to take this approach with Flask-Migrate, there's a tricky "chicken or the egg" issue. When you run flask db init, it creates the migrations/ folder with some files in it. However, if the PVC has already created an empty migrations/ folder, Flask-Migrate thinks the folder already exists. You can't remove the folder with rmdir either, because it is a mounted volume that is in use. You still need to get the output of the init command into the empty migrations/ folder.

    The trick I found was:

    flask db init --directory migration
    mv migration/* migrations/
    

    This initializes all the files you need in a new "migration" folder. You then move everything into the migrations/ folder, where it persists from then on. Flask-Migrate looks in migrations/ by default if you leave out the --directory flag.

    Then delete the now-empty migration folder with rmdir migration (or just wait until your pod restarts, at which point it disappears anyway).
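
If you would rather script the move than type it by hand each time a fresh volume is attached, the same step can be sketched in Python. This is only an illustration of the trick above: `seed_mounted_dir` is a hypothetical helper, and skipping `lost+found` is an assumption about volumes that arrive freshly formatted with that entry in them.

```python
import shutil
from pathlib import Path


def seed_mounted_dir(scratch_dir, mounted_dir):
    """Move the contents of scratch_dir into mounted_dir if it is still empty.

    Returns True if files were moved, False if mounted_dir was already seeded.
    """
    scratch, mounted = Path(scratch_dir), Path(mounted_dir)

    # Assumption: a brand-new volume may contain only a lost+found entry.
    if any(entry.name != "lost+found" for entry in mounted.iterdir()):
        return False  # already seeded; never overwrite existing migrations

    # Move the entries one by one, because the mount point itself
    # cannot be removed or replaced.
    for entry in scratch.iterdir():
        shutil.move(str(entry), str(mounted / entry.name))

    scratch.rmdir()  # the scratch folder is empty now
    return True
```

Calling `seed_mounted_dir("migration", "migrations")` right after the init command would do the same job as the mv and rmdir pair, and it does nothing on later pod restarts because the mounted folder is no longer empty.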

    You now have a proper migrations/ folder with everything in it. When the Flask pod is shut down and restarted, the PVC mounts that populated migrations/ folder back into the pod, so I can upgrade and downgrade again. Just be careful not to delete the PVC!