
Git pre-merge-commit hook: How do I ignore a file during a merge?


Context

I'm working in a complex Git flow in which certain branches carry their own submodules and config files. These files have to be committed on their branch but must never be merged into other branches.

There are only a few such files, but relying on everyone to remember to exclude them on every merge is too risky.

To automate this, I wrote pre-merge-commit hooks, both server-side and local.

When there is a conflict, I use .gitattributes and .git/config to resolve it with a custom merge driver. It works like a charm.

Problem

However, I'm struggling to make it work when there is no conflict. In that case, the merge succeeds and my pre-merge-commit hook is triggered. It does its magic and exits successfully. For some reason, though, Git redoes some merging work after the hook, which makes the hook useless. Here is the behavior I'm seeing:

Before the merge

I have two branches, say A_current and B_incoming.

Both contain a file I don't want merged, called do_not_merge_me. At some point, its content changed in B_incoming, say from contentA to contentB.

During the merge

When I merge B_incoming into A_current, I see the following:

  • The merge proceeds and stages files, including do_not_merge_me.
  • The merge succeeds, so it triggers my hook.
  • My hook finds do_not_merge_me in the staging area and removes it (in the end, it's a git reset do_not_merge_me followed by a git checkout do_not_merge_me).
  • My hook exits cleanly, and do_not_merge_me is no longer in the staging area.
  • Git does some dark magic: it redoes a merge and re-stages do_not_merge_me.
  • Git finalizes the commit, and I see this in my console:
Merge made by the 'recursive' strategy.
 do_not_merge_me               | 2 +-

  • Weirdly, after the merge is done, the correct versions of the files are sitting in my staging area (I had never seen anything left in the staging area after a merge before).

Question

The Git documentation (https://git-scm.com/docs/githooks#_pre_merge_commit) states that the pre-merge-commit hook is invoked after the merge has been carried out successfully and before the commit is made.

My questions are:

  1. Why do I end up with the correct version in the staging area?
  2. Is there any way to achieve what I'm trying to do?
  3. Why does Git do more merging work after the hook? Is it a bug?

Solution

  • The short answer is that you can't.

    When git merge runs, it reads three commits into Git's index. These three commits are:

    • the merge base (in slot 1);
    • the --ours commit (in slot 2); and
    • the --theirs commit (in slot 3).

    These are stored in the usual index format: a path name including slashes, a mode (100644 or 100755 for regular files, 120000 for symbolic links, and 160000 for gitlinks), and a hash ID.
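To look at these entries directly, here is a minimal sketch, assuming a throwaway repository with a deliberately conflicting file. The names show_stages, theirs_branch, and file.txt are made up for illustration. git ls-files --stage prints the mode, hash ID, slot number, and path of each index entry:

```shell
# Hypothetical demo: build a conflict in a temporary repository and list
# the index entries left in slots 1 (base), 2 (ours), and 3 (theirs).
show_stages() (
    set -e
    work=$(mktemp -d)
    trap 'rm -rf "$work"' EXIT
    cd "$work"
    git init -q
    git config user.email "demo@example.com"
    git config user.name "Demo"

    echo base > file.txt
    git add file.txt
    git commit -q -m "base"

    git checkout -q -b theirs_branch
    echo theirs > file.txt
    git commit -q -am "theirs"

    git checkout -q -          # back to the initial branch
    echo ours > file.txt
    git commit -q -am "ours"

    # Both sides changed file.txt, so this merge stops with a conflict.
    git merge theirs_branch >/dev/null 2>&1 || true

    # Columns: mode, hash ID, slot number, path.
    git ls-files --stage
)

show_stages
```

Because the file conflicts, it should appear three times, once per slot, each with mode 100644 and a different hash ID. A file that merged cleanly would appear once, in slot 0.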

    The first part of the merge then compares the modes to make sure those are suitable (if not, this is a merge conflict). Assuming normal files and suitable modes here, it goes on to compare the hash IDs:

    • all three equal? file is successfully merged, drop to slot 0, erase slots 1-3
    • two equal? take the side that changed relative to the merge base (if ours and theirs match each other, take that shared version): drop to slot 0, erase slots 1-3
    • all three unequal? leave for later, for the real merge code.

    There are a few more special cases (e.g., file exists in merge base and theirs/ours, but deleted in ours/theirs) that are also handled directly in the index, I think. Your particular case (file modified in theirs, but identical in ours and base) hits the middle "two equal" case: the file is the same in your commit and the merge base, so Git just assumes that their updated file is the correct result.
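The hash comparison above can be sketched as a small shell function. This is only an illustration of the rule, not Git's actual code: trivial_merge is a hypothetical helper, and the arguments stand in for the hash IDs in slots 1, 2, and 3. When ours and theirs are equal to each other, the sketch returns that shared value.

```shell
# Hypothetical helper mimicking the early three-way check on hash IDs.
# Usage: trivial_merge BASE OURS THEIRS
trivial_merge() {
    base=$1
    ours=$2
    theirs=$3
    if [ "$ours" = "$theirs" ]; then
        # Both sides agree (this also covers "all three equal").
        echo "$ours"
    elif [ "$base" = "$ours" ]; then
        # Only theirs changed: take theirs without any real merge.
        echo "$theirs"
    elif [ "$base" = "$theirs" ]; then
        # Only ours changed: keep ours.
        echo "$ours"
    else
        # All three differ: left for the real merge code.
        echo "CONFLICT"
    fi
}

# The situation from the question: base and ours hold contentA's hash,
# theirs holds contentB's hash.
trivial_merge hashA hashA hashB
```

The final call prints hashB, which is the "take theirs" outcome the question ran into.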

    When Git does this in the early pass, it never runs your merge driver at all. The file goes to staging slot zero—"ready to be committed"—rather than conflicted and you never get a chance to do anything. Your pre-merge-commit will get invoked, but the copy of the file in the index will be the one from the theirs commit.
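A way to check this on your own Git version is sketched below, assuming a throwaway repository. The driver name keepours, the marker file driver_ran, and other.txt are hypothetical. The repository's initial branch plays the role of A_current. The driver only leaves a marker when it is invoked, and since it never touches the file it would keep our copy if it ran.

```shell
# Hypothetical demo: does a custom merge driver run when only "theirs"
# changed the file?
driver_demo() (
    set -e
    work=$(mktemp -d)
    trap 'rm -rf "$work"' EXIT
    mkdir "$work/repo"
    cd "$work/repo"
    git init -q
    git config user.email "demo@example.com"
    git config user.name "Demo"

    # Driver that records its invocation and leaves our version in place.
    git config merge.keepours.driver "touch '$work/driver_ran'"
    echo "do_not_merge_me merge=keepours" > .gitattributes
    echo contentA > do_not_merge_me
    git add .gitattributes do_not_merge_me
    git commit -q -m "base"

    git checkout -q -b B_incoming
    echo contentB > do_not_merge_me
    git commit -q -am "change do_not_merge_me"

    git checkout -q -          # back to the branch standing in for A_current
    echo unrelated > other.txt
    git add other.txt
    git commit -q -m "unrelated work"

    git merge -q --no-edit B_incoming >/dev/null 2>&1

    if [ -e "$work/driver_ran" ]; then driver=ran; else driver=skipped; fi
    echo "$driver $(cat do_not_merge_me)"
)

driver_demo
```

With the behavior described above, this prints "skipped contentB": the driver is never consulted and the incoming content lands in the merge result.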

    We now get into the seriously dark magic part: the phrase "the index" implies that there is a single index (.git/index) that is always used. That is mostly true, but not entirely:

    • $GIT_INDEX_FILE overrides the name;
    • added work-trees (from git worktree add) have their own index; and
    • various Git commands read the index into memory and then work with that.
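The first point is easy to observe. The following is a minimal sketch, assuming a throwaway repository. The names index_demo, alt-index, in_default_index, and in_alt_index are made up for illustration.

```shell
# Hypothetical demo: the same repository seen through two different
# index files.
index_demo() (
    set -e
    work=$(mktemp -d)
    trap 'rm -rf "$work"' EXIT
    mkdir "$work/repo"
    cd "$work/repo"
    git init -q

    echo one > in_default_index
    echo two > in_alt_index

    # Staged in the default index (.git/index).
    git add in_default_index

    # Staged in an alternate index. The path must not exist beforehand;
    # Git creates the file on first write.
    GIT_INDEX_FILE="$work/alt-index" git add in_alt_index

    echo "default: $(git ls-files)"
    echo "alt: $(GIT_INDEX_FILE="$work/alt-index" git ls-files)"
)

index_demo
```

Each index only knows about the file that was added through it, so the two lines list different paths even though both commands ran in the same work-tree.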

    In this case, it looks like git merge holds the index in memory and uses it as is to make the new commit. Your hook's git reset replaces the stage-zero copy in the .git/index file, but git merge does not notice this and goes on to produce the new merge commit using the incoming copy that was there before it even ran your pre-merge-commit hook.

    Assuming this is all true (and it may change from one Git version to another, depending on when and whether Git re-reads the index), this answers your question #1: the on-disk index still holds the version your hook put there, which differs from what was just committed, so it shows up as a staged change. It also makes the answer to #2 "no" and the answer to #3 "you're trying to do something outside the range of what Git handles".

    What you want to do is not inherently unreasonable, but Git just doesn't support it.