python travis-ci pre-commit-hook pre-commit

How to synchronize CI and my pre-commit checks?


I set up some pre-commit hooks that I run using the poethepoet library in a Poetry-managed project, and they are working well.

I have also started setting up a CI pipeline on Travis CI; currently it only runs my unit tests.

However, since this project could become collaborative in the future, I would like to make sure my coworkers' code still goes through those checks even if, for some reason, they run git commit or git push with the --no-verify option.

Is it even good practice to mirror the work of pre-commit hooks in an actual CI pipeline? I am starting to question it, as I am having a hard time finding resources that describe how to do it.

If re-running the pre-commit checks in a CI pipeline is not good practice, what is the common way to run linters in CI when you are already using the pre-commit library?

Ultimately, I would like a way to make sure the linter versions used by my pre-commit hooks are the same as those used in the CI pipeline, and to maintain this easily: if I update the version of the Ruff hook, the Ruff version used in the CI pipeline should be updated as well.


Solution

  • Your suspicion that you ought to run your checks both in CI and as pre-commit hooks is well founded. There are a number of reasons why pre-commit hooks may not run:

    • Someone uses --no-verify, as you mentioned. (Note: this is not always malicious. It's certainly possible to be in a broken state and still want to push to a remote branch as a backup.)
    • Depending on your pre-commit framework, it's possible to forget to set up or install the hooks.
    • There are probably other cases as well.
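
    The forgotten-install case can be detected early. As a minimal sketch, assuming the default hooks location (.git/hooks, with core.hooksPath unset and .git being a regular directory rather than a worktree or submodule pointer), a helper such as the hypothetical hook_is_installed below could be called from a setup or test script to warn contributors whose hook is missing:

```python
import os
from pathlib import Path


def hook_is_installed(repo_root, hook_name="pre-commit"):
    """Return True if an executable hook file exists in the default hooks directory.

    Assumption: hooks live in <repo_root>/.git/hooks. A repository that sets
    core.hooksPath, or a worktree/submodule checkout, needs a different lookup.
    """
    hook_path = Path(repo_root) / ".git" / "hooks" / hook_name
    # Git only runs hook files that exist and are executable.
    return hook_path.is_file() and os.access(hook_path, os.X_OK)
```

    This only tells you that some hook file is present, not that it is the one your framework would install, so it is a convenience check rather than an enforcement mechanism.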

    So, assuming you intend to run these checks both in CI and as pre-commit hooks, what's the best way to reduce code duplication?

    This may depend a lot on your preferred hook framework, but my usual strategy is to make each hook a standalone script, something that can be run on demand at any time. This makes it really easy to run:

    • as a pre-commit hook: The actual hook just calls the lint script
    • in CI: CI job steps call the lint script
    • on demand while debugging: execute the script

    In general, I've found this makes it pretty convenient to reuse this type of pre-check-in (or at least pre-merge) check with a minimum of code duplication.
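
    As a rough illustration of the standalone-script idea, here is a minimal sketch of what such a lint script might look like. The file name scripts/lint.py, the function run_checks, and the list of Ruff commands are assumptions made for the example, not something taken from your project; substitute whatever your hooks actually run.

```python
"""Hypothetical scripts/lint.py: one entry point for the hook, CI, and manual runs."""
import subprocess
import sys

# Assumed checks for illustration; replace with your project's real commands.
DEFAULT_CHECKS = [
    ["ruff", "check", "."],
    ["ruff", "format", "--check", "."],
]


def run_checks(checks):
    """Run every command in order and return 0 only if all of them succeed."""
    failed = []
    for command in checks:
        print("$ " + " ".join(command), flush=True)
        try:
            returncode = subprocess.run(command).returncode
        except OSError as error:
            # The tool could not be started (for example, it is not installed).
            print(f"could not run {command[0]}: {error}", file=sys.stderr)
            returncode = 127
        if returncode != 0:
            failed.append(command)
    for command in failed:
        print("FAILED: " + " ".join(command), file=sys.stderr)
    # A non-zero exit status makes both the git hook and the CI step fail.
    return 1 if failed else 0


# At the bottom of the real script you would add:
#     raise SystemExit(run_checks(DEFAULT_CHECKS))
```

    The hook (for example, a poethepoet task) and the Travis CI job would then both invoke the same command, such as poetry run python scripts/lint.py. If Ruff is declared as a development dependency of the Poetry project, which is an assumption about your setup, the lock file pins a single Ruff version for both places, so bumping it once updates the hook and CI together.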

    Anticipated question

    Why bother with CI and pre-commit hooks? Wouldn't it be even less duplication to just use one?

    There are benefits to running validations in CI and in pre-commit hooks:

    • validation in CI: This is the only validation that you can "guarantee" (assuming correct access privileges on your remote repo, protected branches, etc.). As noted above, there are ways to bypass pre-commit hooks, so running your validations in CI is the only sure way to enforce them on every change.
    • validation in pre-commit hooks: This helps you catch issues before they reach CI, which gives you faster feedback while developing and avoids spending CI minutes or cloud resources on runners.