In a Git pre-commit hook, temporarily remove all changes that are not about to be committed

I would like my pre-commit hook to compile the program and run all the automatic tests before allowing the commit. The problem is that my working copy is usually not clean when I'm committing. There are unstaged changes or even untracked files that I don't want to commit. Sometimes I even explicitly specify only a few files to commit, which have nothing to do with what is currently staged.

Of course I want to compile and test only the changes that will be committed, ignoring the other ones. There would be 3 steps to it:

1. Remove all changes that won't be committed.
2. Run the tests.
3. Restore all the changes to exactly how they were before the 1st step.

The 1st step could be achieved by running git stash push --include-untracked --keep-index. The stash entry would also help with the 3rd step. However, I don't know what to do when I'm committing an explicit list of files that are not staged.

(The 2nd step is not really a part of the question.)

The 3rd step could theoretically be done with the command git stash pop --index, but this command seems to be prone to conflicts if a file was staged and then changed further without staging it again.

This script creates a repository with some files and changes that cover various corner cases:

#!/usr/bin/env sh

set -e -x

git init test-repo
cd test-repo
git config user.email "you@example.com"
git config user.name "Your Name"

echo foo >old-file-unchanged
echo foo >old-file-changed-staged
echo foo >old-file-changed-unstaged
echo foo >old-file-changed-both
git add .
git commit -m 'previous commit'

echo bar >old-file-changed-staged
echo bar >old-file-changed-both
echo bar >new-file-staged
echo bar >new-file-both
git add .
echo baz >old-file-changed-unstaged
echo baz >old-file-changed-both
echo baz >new-file-both
echo baz >untracked-file

Solution

You were actually quite close to a correct solution.

(In this answer, I'm going to use the word "cache" instead of "stage" because the latter is too similar to "stash".)

In fact, the trick with using stash would work even if you were to commit files that are not cached. This is because Git changes the cache for the duration of running hooks, so it always contains the correct files. You can check it by adding the command git status to your pre-commit hook.
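To see this for yourself, here is a minimal sketch that builds a scratch repository and installs a throwaway hook. The names file-a, file-b and .git/hook-saw are made up for the demonstration, and the sketch assumes hooks are read from the default .git/hooks directory. file-a is cached, but only file-b is named on the command line, and the hook records what it finds in the cache.

```shell
#!/bin/sh
set -e

# Scratch repository; all file names below are made up for this demonstration.
repo=$(mktemp -d)
cd "$repo"
git init --quiet .
git config user.email "you@example.com"
git config user.name "Your Name"

echo foo >file-a
echo foo >file-b
git add .
git commit --quiet -m 'previous commit'

# A pre-commit hook that records which files are cached while it runs.
mkdir -p .git/hooks
cat >.git/hooks/pre-commit <<'EOF'
#!/bin/sh
git diff --cached --name-only >.git/hook-saw
EOF
chmod +x .git/hooks/pre-commit

# Cache a change to file-a, but commit only file-b by naming it explicitly.
echo bar >file-a
echo bar >file-b
git add file-a
git commit --quiet -m 'only file-b' -- file-b

hook_saw=$(cat .git/hook-saw)
printf 'hook saw: %s\n' "$hook_saw"
```

The hook should report only file-b, even though file-a was the cached file when git commit was invoked.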

So you can use git stash push --include-untracked --keep-index.

The problem with conflicts when restoring the stash is also quite easily solvable. You already have all the changes backed up in the stash, so there is no risk of losing anything. Just remove all the current changes and apply the stash to a clean slate.

This can be done in two steps. The command git reset --hard will discard all changes to tracked files, both in the cache and in the working directory. The command git clean -d --force will remove all untracked files.

After that you can run git stash pop --index without any risk of conflicts.
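The whole round trip can be tried in a scratch repository. The sketch below is only an illustration under simple assumptions (one tracked file, one untracked file, names made up). It first tries to pop the stash directly, which is expected to fail because the same line was cached and then edited again, and then repeats the pop on a clean slate.

```shell
#!/bin/sh
set -e

# Scratch repository; file names are made up for this demonstration.
repo=$(mktemp -d)
cd "$repo"
git init --quiet .
git config user.email "you@example.com"
git config user.name "Your Name"

echo foo >file
git add file
git commit --quiet -m 'previous commit'

# Cache one change, then edit the same file again and add an untracked file.
echo bar >file
git add file
echo baz >file
echo baz >untracked-file
before=$(git status --porcelain)

git stash push --include-untracked --keep-index --quiet

# Popping straight onto the kept cache is expected to fail: "bar" in the
# working directory clashes with "baz" from the stash.
direct_pop_status=0
git stash pop --index >/dev/null 2>&1 || direct_pop_status=$?

# Clean slate first, then pop. The stash entry is still there because a
# failed pop does not drop it.
git reset --hard --quiet
git clean -d --force --quiet
git stash pop --index --quiet

after=$(git status --porcelain)
printf 'before:\n%s\nafter:\n%s\n' "$before" "$after"
```

If everything went well, the two status listings are identical: the file is still cached as "bar", modified to "baz" in the working directory, and the untracked file is back.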

A simple hook would look like this:

#!/bin/sh

set -e

git stash push --include-untracked --keep-index --quiet --message='Backed up state for the pre-commit hook (if you can see it, something went wrong)'

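tests_result=0

#TODO Tests go here
# (record a failure instead of exiting, e.g.: your-test-command || tests_result=$?)

git reset --hard --quiet
git clean -d --force --quiet
git stash pop --index --quiet

exit "$tests_result"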

Let's break it down.

set -e ensures that the script stops immediately in case of an error so it won't do any further damage. The stash entry with a backup of all changes is created at the beginning, so in case of an error you can take manual control and restore everything.

git stash push --include-untracked --keep-index --quiet --message='...' fulfills two purposes. It creates a backup of all current changes and removes all changes that are not cached from the working directory. The flag --include-untracked makes sure that untracked files are also backed up and removed. The flag --keep-index cancels removal of the cached changes from the working directory (but they are still included in the stash).

#TODO Tests go here is where your tests go. Make sure you don't exit the script here. You still need to restore the stashed changes before doing that. Instead of exiting with an error code, store it in the variable tests_result.
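One way to record the result without tripping set -e is the "|| variable=$?" idiom. In this small sketch, run_tests is a hypothetical stand-in for a real test suite that fails with exit code 3.

```shell
#!/bin/sh
set -e

# Hypothetical stand-in for a real test suite; it fails with exit code 3.
run_tests() {
    return 3
}

tests_result=0
# A failing command on the left side of "||" does not trigger "set -e",
# so the script keeps going and the exit code is remembered.
run_tests || tests_result=$?

printf 'tests exited with %s, cleanup can still run\n' "$tests_result"
```

The script reaches the last line despite the failure, so the commands that restore the stash would still get their turn before the final exit.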

git reset --hard --quiet discards all changes to tracked files. The flag --hard makes sure that both the cache and the working directory are reset to the last commit, so nothing stays cached and no modification is left behind.

git clean -d --force --quiet removes all the untracked files from the working directory. The flag -d tells Git to remove directories recursively. The flag --force tells Git you know what you're doing and it is really supposed to delete all those files.

git stash pop --index --quiet restores all changes saved in the latest stash and removes it. The flag --index tells it to restore the cache as well, so it doesn't mix up which changes were cached and which were not.

Disadvantages of this method

This method is only semi-robust, but it should be sufficient for simple use cases. However, there are quite a few corner cases that may break something during real-life usage.

git stash push refuses to work with files that were only added with the flag --intent-to-add. I'm not sure why that is and I haven't found a way to fix it. You can bypass the problem by adding the file without the flag, or at least by adding it as an empty file and leaving only the content not cached.
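Another possible workaround is to drop the intent-to-add entries from the cache before stashing and add them back afterwards. The sketch below only shows how such entries can be found and dropped. It assumes a reasonably recent Git, where intent-to-add entries show up as added files in git diff, and all file names are made up.

```shell
#!/bin/sh
set -e

# Scratch repository; file names are made up for this demonstration.
repo=$(mktemp -d)
cd "$repo"
git init --quiet .
git config user.email "you@example.com"
git config user.name "Your Name"

echo foo >tracked-file
git add tracked-file
git commit --quiet -m 'previous commit'

echo bar >new-file
git add --intent-to-add new-file
# An ordinary modification that is not cached; it must not be listed below.
echo baz >tracked-file

# Assumption: a recent Git reports intent-to-add entries as "added" here.
intent_to_add=$(git diff --name-only --diff-filter=A)
printf 'intent-to-add: %s\n' "$intent_to_add"

# Dropping the entry from the cache leaves the file on disk as untracked.
git reset --quiet -- new-file
after_reset=$(git status --porcelain -- new-file)

# After the tests, the entry can be re-created the same way it was made.
git add --intent-to-add new-file
```

Between the reset and the final git add, the file is just an untracked file, which git stash push --include-untracked can handle.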

Git tracks only files, not directories. However, the command git clean can remove directories. As a result, the script will remove empty directories (unless they are ignored).

Files that were added to .gitignore since the last commit will be deleted. I consider this a feature, but if you want to prevent it, you can do so by reversing the order of git reset and git clean. Note that this works only if .gitignore is included in the current commit.

git stash push does not create a new stash if there are no changes, but it still returns 0. To handle commits without changes, such as changing the message using --amend, you would need to add some code that checks whether a stash was really created and pops it only if it was.
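One possible way to do that check is to compare what refs/stash points to before and after the push. This is only a sketch: the helper name create_stash_if_needed and the variable stash_created are made up, and the scratch repository exists just to exercise both cases.

```shell
#!/bin/sh
set -e

# Scratch repository; file names are made up for this demonstration.
repo=$(mktemp -d)
cd "$repo"
git init --quiet .
git config user.email "you@example.com"
git config user.name "Your Name"

echo foo >file
git add file
git commit --quiet -m 'previous commit'

# Hypothetical helper: sets stash_created to "yes" or "no".
create_stash_if_needed() {
    # rev-parse fails quietly when there is no stash yet, hence "|| true".
    old_stash=$(git rev-parse --verify --quiet refs/stash || true)
    git stash push --include-untracked --keep-index --quiet
    new_stash=$(git rev-parse --verify --quiet refs/stash || true)
    if [ "$old_stash" = "$new_stash" ]
    then
        stash_created=no
    else
        stash_created=yes
    fi
}

# Clean working directory: nothing to stash.
create_stash_if_needed
clean_tree_result=$stash_created

# Dirty working directory: a stash entry is created and must be popped.
echo bar >file
create_stash_if_needed
dirty_tree_result=$stash_created
if [ "$stash_created" = yes ]
then
    git reset --hard --quiet
    git clean -d --force --quiet
    git stash pop --index --quiet
fi

printf 'clean: %s, dirty: %s\n' "$clean_tree_result" "$dirty_tree_result"
```

Comparing the ref instead of counting stash entries also behaves sensibly when older, unrelated stash entries already exist.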

Git stash seems to remove the information about the current merge, so using this code on a merge commit will break it. To prevent that, you need to back up the files .git/MERGE_* and restore them after popping the stash.

A robust solution

I've managed to iron out most of the kinks of this method (making the code much longer in the process).

The only remaining problem is removing empty directories and ignored files (as described above). I don't think these are severe enough issues to take time trying to bypass them. (It is doable, though.)

#!/bin/sh

backup_dir='./pre-commit-hook-backup'
if [ -e "$backup_dir" ]
then
    printf '"%s" already exists!\n' "$backup_dir" 1>&2
    exit 1
fi

intent_to_add_list_file="$backup_dir/intent-to-add"
remove_intent_to_add() {
    git diff --name-only --diff-filter=A | tr '\n' '\0' >"$intent_to_add_list_file"
    xargs -0 -r -- git reset --quiet -- <"$intent_to_add_list_file"
}
readd_intent_to_add() {
    xargs -0 -r -- git add --intent-to-add --force -- <"$intent_to_add_list_file"
}

backup_merge_info() {
    echo 'If you can see this, tests in the `pre-commit` hook went wrong. You need to fix this manually.' >"$backup_dir/README"
    find ./.git -name 'MERGE_*' -exec cp {} "$backup_dir" \;
}
restore_merge_info() {
    find "$backup_dir" -name 'MERGE_*' -exec mv {} ./.git \;
}

create_stash() {
    git stash push --include-untracked --keep-index --quiet --message='Backed up state for the pre-commit hook (if you can see it, something went wrong)'
}
restore_stash() {
    git reset --hard --quiet
    git clean -d --force --quiet
    git stash pop --index --quiet
}

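run_tests() (
    # Runs in a subshell, so "set +e" does not leak into the rest of the script.
    set +e
    # Anything printed to stdout ends up in $tests_result, so send test output to stderr.
    printf 'TODO: Put your tests here.\n' 1>&2
    # Print the exit code of the last command (your tests) as the only stdout output.
    echo $?
)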

set -e
mkdir "$backup_dir"
remove_intent_to_add
backup_merge_info
create_stash
tests_result=$(run_tests)
restore_stash
restore_merge_info
readd_intent_to_add
rm -r "$backup_dir"
exit "$tests_result"