
Automating Python package release process


I've just started an open source Python project that I hope might be popular one day. At the moment to release a new version I have to do a few things.

  1. Test all the things.
  2. Edit mypackage.VERSION variable, which setup.py imports from __init__
  3. Build packages and wheels with python setup.py sdist bdist_wheel
  4. Write a changelog entry to CHANGELOG file
  5. Commit my changes, echo some of that changelog
  6. Tag that commit as a release, copy that changelog entry over again.
  7. Drag in my built files so people can download them from the release
  8. Use Twine to push the packages up onto PyPI
  9. Test again on my staging server via PyPI.

If I had to sum up everything I hate about my project in nine bullet points, I think we'd be looking at a very similar list. The thing that cuts is that, apart from making up a new version number and writing the commit/changelog message, all of this is painfully dull.

Can I automate any of these tasks in such a way that I might be able to, for example, let GitHub CI do everything just from my commits?

I already have a decade of Python experience and a bit of CI experience, but I'm very new to packaging Python and actively interacting with PyPI. I suspect I'm not the only person driven crazy by the manual repetition here, so I'm looking for tools (or services) that can make this process easier.


Solution

  • The following is my own opinionated take on your list. There is a certain range of automation you can achieve, and I'll try to provide a reasonable starting point, and then some hints on how you can go further from there.


    CI without CD

    Adopting this part should already get rid of most of the annoying manual work, and you can automate away more and more as the need arises. If you're not comfortable maintaining a good amount of CI code, you should start here.

    Things you'll need are a CI (as you already noted) and a package manager. Something you won't get around is pushing your changes and a new tag with git, so parts of steps 5 and 6 remain manual.

    Package management

    I'll use poetry to keep things concise and because I like it[1], but there are also other options. This will take care of steps 2, 3, 7, 8, and the unlisted step 10, "update my dependencies and test them for compatibility", which is incredibly annoying as soon as it turns out to be a problem.

    The bad news when using poetry is that you'll need to move all packaging configuration into a new file, pyproject.toml. The good news is that you don't need a separate setup.py, setup.cfg, MANIFEST.in, or requirements.txt any more, since pyproject.toml is a provisional standard for packaging and other tools. Poetry also has a walkthrough on how to port over all the relevant info.
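
    If other code still reads mypackage.VERSION (step 2 in your list), one way to avoid editing it by hand is to look the version up from the installed package metadata, so that pyproject.toml stays the only place where it is written down. This is a minimal sketch and not something poetry generates for you: read_version and the fallback string are names I made up for illustration, my_django_lib is just an example package name, and importlib.metadata needs Python 3.8 or newer.

```python
# Hypothetical content of my_django_lib/__init__.py
from importlib.metadata import PackageNotFoundError, version


def read_version(dist_name: str, default: str = "0.0.0.dev0") -> str:
    """Return the installed version of dist_name, or default if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Happens when the code is run from a checkout that was never installed.
        return default


# Keeps the old attribute alive without a hand-edited string.
VERSION = read_version("my_django_lib")
```

    With this in place, "poetry version patch" is the only thing that changes the version, and the attribute follows along once the package is installed again.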

    Once the setup is ready, the new deployment workflow would be:

    $ poetry update           # update dependencies, may be skipped 
    $ poetry version patch    # bump version
    Bumping version from 1.1.2 to 1.1.3
    # finalize git stuff, e.g. add -u, commit -m 'v1.1.3', tag v1.1.3, push
    $ poetry publish --build  # build and publish to PyPI
    Building my_django_lib (1.1.3)
     - Building sdist
     - Built my_django_lib-1.1.3.tar.gz
    
     - Building wheel
     - Built my_django_lib-1.1.3-py3-none-any.whl
    
    Publishing my_django_lib (1.1.3) to PyPI
     - Uploading my_django_lib-1.1.3-py3-none-any.whl 100%
     - Uploading my_django_lib-1.1.3.tar.gz 100%
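
    To make the bump rules concrete, here is a rough sketch of what the patch, minor, and major rules do to a plain MAJOR.MINOR.PATCH version string. It is an illustration only and not poetry's actual implementation: the bump function is hypothetical, and it ignores pre-release rules and version suffixes.

```python
def bump(version: str, rule: str) -> str:
    """Bump a MAJOR.MINOR.PATCH version string according to rule."""
    major, minor, patch = (int(part) for part in version.split("."))
    if rule == "major":
        return f"{major + 1}.0.0"
    if rule == "minor":
        return f"{major}.{minor + 1}.0"
    if rule == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump rule: {rule!r}")


print(bump("1.1.2", "patch"))  # 1.1.3, as in the poetry output above
print(bump("1.1.2", "minor"))  # 1.2.0
print(bump("1.1.2", "major"))  # 2.0.0
```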
    

    This should already be a lot shorter than what you're currently doing. If you always execute the exact same git commands, are not afraid to automate a push, and take good care of your .gitignore file, feel free to add something like this function to your ~/.bashrc and call it instead:

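    git_cord () {
      # read the version from pyproject.toml (-P needs GNU grep)
      version=$(grep -Po '(?<=^version = ")(.*)(?=")' pyproject.toml)
      git add -u
      git commit -m "v${version}"
      git tag "v${version}"
      # push the current branch together with the new tag
      git push -u origin HEAD "v${version}"
    }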
    

    Getting started with gitlab-CI

    The CI can in principle handle everything surrounding the deployment process, including version bumping and publishing. But the former requires that your CI can push to your repo (which has annoying side effects) and the latter that it can publish to PyPI in your name (which is risky, and makes debugging the CI a pain). I think it's not unusual to prefer to do those two steps by hand, so this minimal approach will only handle steps 1 and 9. More extensive testing and build jobs can be included afterwards.

    The correct setup of a CI depends on which one you plan to use. The list of options for github is long, so I'll instead focus on gitlab's builtin CI. It's free, has very little magic (which makes it comparatively portable), and the binaries for the CI runners are open, free, and actually documented, so you can debug your CI locally or start and connect new runners if the free ones don't cut it for you.

    Here is a small .gitlab-ci.yml that you can put into your project root in order to run the tests. Every single job in the pipeline (skipping setup and install commands) should also be executable in your dev environment; keeping it that way makes for a better maintainer experience.

    image: python:3.7-alpine
    
    stages:
      - build
      - test
    
    packaging:
      stage: build
      script:
        - pip install poetry
        - poetry build
      artifacts:
        paths: 
          - dist
    
    pytest:
      stage: test
      script:
        - pip install dist/*.whl
        - pip install pytest
        - pytest
    

    Setting up the build and test stages like this handles steps 1 and 9 in one fell swoop, while also running the test suite against the installed package instead of your source files. It will only work properly if you have a src-layout in your project, though, which makes local sources unimportable from the project root. Some info on why that would be a good idea here and here.

    Poetry can create a src-layout template you can move your code into with poetry new my_django_lib --src.
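
    If you want the test suite itself to confirm that it runs against the installed wheel and not against the checkout, a small helper along these lines could be used. This is only a sketch under assumptions: is_installed_copy is a made-up name, and it only looks at the "purelib" directory reported by sysconfig, so it will report False for editable installs such as the one "poetry install" creates in your dev environment.

```python
import sysconfig
from pathlib import Path


def is_installed_copy(module_file: str) -> bool:
    """Return True if module_file lives below the environment's site-packages."""
    site_packages = Path(sysconfig.get_paths()["purelib"]).resolve()
    return site_packages in Path(module_file).resolve().parents


# Hypothetical usage inside a test module that only runs in the CI:
#
#     import my_django_lib
#
#     def test_runs_against_installed_package():
#         assert is_installed_copy(my_django_lib.__file__)
```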

    The changelog

    While there are tools out there that automatically create a changelog from commit messages, keeping a good changelog is one of those things that benefit greatly from being cared for by hand. So, my advice is no automation for step 4.

    One way to think about it is that the manual CHANGELOG file contains information that is relevant to your users, and should only feature information like new features, important bugfixes, and deprecations.

    More fine grained information that might be important for contributors or plugin writers would be located in MRs, commit messages, or issue discussions, and should not make it into the CHANGELOG. You can try to collect it somehow, but navigating such an AUTOLOG is probably about as cumbersome as sifting through the primary sources I just mentioned.

    So in short, the changelog-related parts of steps 5 and 6 can be skipped.


    CI with CD

    Adding CD doesn't change too much, except that you don't have to release by hand any more. You can still release with poetry in case the CI is down, buggy, or you don't want to wait for the pipeline to release a hotfix.

    This would alter the workflow in the following way:

    • everyday work
      • write code (can't avoid this one yet)
      • document progress in commit messages and/or MRs (I prefer MRs, even for my own changes, and squash all commits on merge)
      • push to gitlab / merge MRs
    • on release
      • create a tag, run poetry version and maybe poetry update
      • write release notes in CHANGELOG
      • push to gitlab

    This addition to the former .gitlab-ci.yml file should work right away if you supply the secrets PYPI_USER and PYPI_PASSWORD:

    stages:
      - build
      - test
      - release
      
    [...]  # packaging and pytest unchanged
    
    upload:
      stage: release
      only:
        - tags
        # Or alternatively "- /^v\d+\.\d+\.\d+/" if you also use non-release
        # tags, the regex only matches tags that look like this: "v1.12.0"
      script:
        - pip install poetry
        # uploads the files in dist/ that the packaging job kept as artifacts
        - poetry publish -u "${PYPI_USER}" -p "${PYPI_PASSWORD}"
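
    If you go with the tag regex from the comment above, it can help to try it against a few tag names before relying on it. The sketch below does that with Python's re module, which I'm assuming behaves the same as gitlab's regex engine for a pattern this simple; is_release_tag is a hypothetical helper. Note that the pattern has no anchor at the end, so a tag like v1.12.0rc1 also counts as a release.

```python
import re

# Same pattern as in the .gitlab-ci.yml comment, without the surrounding slashes.
RELEASE_TAG = re.compile(r"^v\d+\.\d+\.\d+")


def is_release_tag(tag: str) -> bool:
    """Return True if tag starts with 'v' followed by three dot-separated numbers."""
    return RELEASE_TAG.search(tag) is not None


for tag in ["v1.12.0", "1.12.0", "nightly", "v1.12.0rc1"]:
    print(f"{tag}: {is_release_tag(tag)}")
```

    Adding a "$" at the end of the pattern would restrict it to plain vX.Y.Z tags, if that is what you want.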
    

    Some useful links:

    • .gitlab-ci.yml documentation
    • list of predefined variables, this is where most of gitlab CI's obscurities lie
    • the long version of my .gitlab-ci.yml template, with additional stages that may or may not be useful to you. It expects a src layout of your code.
      • lint: type checking, coverage, and code style
      • security: checking your own code and your dependencies for vulnerabilities
      • release.docs: public gitlab pages section where docs are served that are created automatically based on your docstrings
      • The build stage creates a wheelhouse from the poetry.lock file that can be used for installing dependencies later instead of fetching them from PyPI. This is a little faster, saves network bandwidth, and asserts the use of specific versions if you want to debug, but might be overkill and requires the use of a poetry pre-release.

    [1] Among other things, poetry also 1) handles the virtualenv for you, 2) creates a hashed lockfile in case you need reproducible builds, and 3) makes contribution easier, since you only have to run "poetry install" after cloning a repo and are ready to go.