sqlalchemy, database-migration, alembic

Is there any way to generate sequential Revision IDs in Alembic?


I'm using Alembic as a database migration tool for a Python project. When I run a command like this:

alembic revision -m "adding a column"

...it will add a new file called alembic/versions/xxxxxxxxxxxx_adding_a_column.py where xxxxxxxxxxxx is a randomly generated, 12-character hexadecimal string.

From the perspective of making things human-readable, this is a little bit problematic, because it means that when looking at the alembic/versions directory, all the files will appear in random order, rather than in sequential / chronological order.

Are there any options in Alembic to ensure these prefix revision IDs are sequential? I suppose I could rename the files manually and then update the references, but I'm wondering if there's already a feature like that baked in.


Solution

  • By the sounds of it, you are more interested in having the revision files listed sequentially than in having sequentially ordered revision ids. The former can be achieved without any change to how the revision ids are generated.

    The alembic.ini file that is generated when you run alembic init alembic has a section that configures the naming of the revision files:

    # template used to generate migration files
    # file_template = %%(rev)s_%%(slug)s
    

    And here is the explanation from the docs:

    file_template - this is the naming scheme used to generate new migration files. The value present is the default, so is commented out. Tokens available include:

    • %%(rev)s - revision id
    • %%(slug)s - a truncated string derived from the revision message
    • %%(year)d, %%(month).2d, %%(day).2d, %%(hour).2d, %%(minute).2d, %%(second).2d - components of the create date, by default datetime.datetime.now() unless the timezone configuration option is also used.

    So adding file_template = %%(year)d-%%(month).2d-%%(day).2d_%%(rev)s_%%(slug)s to alembic.ini would name your revision like 2018-11-15_xxxxxxxxxxxx_adding_a_column.py.
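    The doubled percent signs are ini-file escaping: once the file is read, each %% collapses to a single %, leaving an ordinary Python %-style format string. The sketch below reproduces that expansion with the standard library only. It approximates the behavior described in the docs rather than showing Alembic's own code, and the token values are placeholders standing in for what Alembic would supply.

```python
import configparser

# The same line you would put in alembic.ini, escaped with "%%".
INI_TEXT = """
[alembic]
file_template = %%(year)d-%%(month).2d-%%(day).2d_%%(rev)s_%%(slug)s
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# Reading the value un-escapes "%%" to "%".
template = parser.get("alembic", "file_template")

# Placeholder values; Alembic fills these in from the revision itself.
tokens = {
    "year": 2018,
    "month": 11,
    "day": 15,
    "rev": "xxxxxxxxxxxx",
    "slug": "adding_a_column",
}

filename = template % tokens + ".py"
print(filename)
```

    The .2d precision zero-pads single-digit months and days, which is what keeps the files in chronological order when the directory is sorted by name.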

    I found this issue: https://bitbucket.org/zzzeek/alembic/issues/371/add-unixtime-stamp-to-start-of-versions which pointed me in the right direction.

    A comment from that issue:

    timestamps don't necessarily tell you which file is the most "recent", since branching is allowed. "alembic history" is meant to be the best source of truth on this.

    So, the file naming solution will not guarantee that migrations are ordered logically in the directory (but will help IMO). The same argument could be made against having sequential ids.

    If you do want to specify your own revision identifier, use the --rev-id flag on the command line.

    E.g.:

    alembic revision -m 'a message' --rev-id=1

    This generated a file called 1_a_message.py:

    """a message
    
    Revision ID: 1
    Revises:
    Create Date: 2018-11-15 13:40:31.228888
    
    """
    from alembic import op
    import sqlalchemy as sa
    
    
    # revision identifiers, used by Alembic.
    revision = '1'
    down_revision = None
    branch_labels = None
    depends_on = None
    
    
    def upgrade():
        pass
    
    
    def downgrade():
        pass
    

    So you can definitely manage the revision identifiers yourself. It would be trivial to write a bash script that triggers your revision generation and automatically passes a datetime-based rev_id, e.g. --rev-id=<current datetime>, to control the order in which the files are listed in the directory.
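    As a rough illustration of such a wrapper, here is a minimal sketch in Python rather than bash. The function names and the YYYYMMDDHHMMSS format are my own choices, not anything Alembic prescribes; the only Alembic features it relies on are the -m and --rev-id flags shown above.

```python
import datetime
import subprocess


def datetime_rev_id(now):
    """Return a sortable, datetime-based revision id (hypothetical format)."""
    return now.strftime("%Y%m%d%H%M%S")


def build_revision_command(message, now):
    """Build the alembic command line with an explicit --rev-id."""
    return [
        "alembic",
        "revision",
        "-m",
        message,
        f"--rev-id={datetime_rev_id(now)}",
    ]


def create_revision(message):
    """Run alembic with a revision id derived from the current time."""
    command = build_revision_command(message, datetime.datetime.now())
    subprocess.run(command, check=True)
```

    Calling create_revision("adding a column") from the project root would then run the equivalent of alembic revision -m "adding a column" --rev-id=<current datetime>. Because the id is a fixed-width timestamp, sorting by file name matches creation order even with the default file_template.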

    If the revision id isn't specified, the function rev_id() found in alembic.util.langhelpers is called:

    import uuid

    def rev_id():
        return uuid.uuid4().hex[-12:]
    

    Function calls to rev_id() are hard-coded in the alembic source, so short of monkey-patching the function, it will be difficult to override the behavior. You could create a fork of the library and redefine that function or make the function that it calls for id generation configurable.
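    A less invasive route, if you want truly sequential ids, is to compute the next number outside Alembic and hand it over with --rev-id. The helper below is a hypothetical sketch: it assumes every sequentially numbered revision file starts with its numeric id followed by an underscore (as with the default file_template), and that you pass the path of your versions directory.

```python
import re
from pathlib import Path

# Matches file names such as "0002_adding_a_column.py".
_NUMERIC_PREFIX = re.compile(r"^(\d+)_")


def next_sequential_rev_id(versions_dir, width=4):
    """Return the next zero-padded id after the highest numeric prefix found."""
    highest = 0
    for path in Path(versions_dir).glob("*.py"):
        match = _NUMERIC_PREFIX.match(path.name)
        if match:
            highest = max(highest, int(match.group(1)))
    return str(highest + 1).zfill(width)
```

    The result can be passed straight to the command line, e.g. --rev-id=0003. Zero-padding keeps the directory listing in numeric order; as noted above, though, branching means the numbering is only a convenience and alembic history remains the source of truth.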