
git merge conflicts: which commit was the common ancestor?


I want to know the identity of the "common ancestor" commit during a git merge conflict resolution.

Said differently: I want to know the hash of the revision that the BASE version is being drawn from when I'm resolving conflicts during a git merge.

Is there a command that will tell me this?

Why I want to know

  • I am (unfortunately) doing a very complicated merge, with lots of conflicts.
  • I want to be able to visualize the two change paths (BASE -> LOCAL and BASE -> REMOTE) to give me more context about how these two sets of changes happened, who made them, when, on what branches, etc.

Helpful (?) related info

  • Recall that for any particular conflicting file, there is

    • a BASE version (git show :1:<path>), which comes from the common ancestor commit (whose identity is the answer to my question)
    • the LOCAL (branch I was on: git show :2:<path>) version and
    • the REMOTE (branch I'm merging in: git show :3:<path>) version
  • I know that I can get the SHA hash of the BASE file itself, by using git ls-files -u, which gives output like

$ git ls-files -u | grep "<path>"
100644 <SHA of BASE file> 1 <path>
100644 <SHA of LOCAL file> 2 <path>
100644 <SHA of REMOTE file> 3 <path>
  • I am using git mergetool and gvimdiff3 to view conflicts. This tool shows each conflicting file (with the "<<<", ">>>", and "|||" conflict markers), as well as three other files for reference: LOCAL, BASE, and REMOTE. All very well and good.

  • My BASE files sometimes have conflict markers in them (!) which look like this:

<<<<<<<<< Temporary merge branch 1
<snip>
||||||||| merged common ancestors
=========
<snip>
>>>>>>>>> Temporary merge branch 2

The git merge documentation says this about the recursive strategy: "When there is more than one common ancestor that can be used for 3-way merge, it creates a merged tree of the common ancestors and uses that as the reference tree for the 3-way merge."

  • I guess what I am seeing is that the "common ancestor" is a merged hybrid of several commits. Nevertheless, that merged hybrid must have been generated somehow, must have a SHA, and must have parents whose identities I want to know.

Solution

  • This one is tricky, and a bit nasty. You're completely correct here:

    I guess what I am seeing is that the "common ancestor" is a merged hybrid of several commits. Nevertheless, that merged hybrid must have been generated somehow, must have a SHA, and must have parents whose identities I want to know.

    As LeGEC said, you do have both HEAD and MERGE_HEAD available when Git stops with this merge conflict.

    You can find the hash IDs of the merge bases (plural) with:

    git merge-base --all HEAD MERGE_HEAD
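
    To get at the two change paths the question asks about (BASE -> LOCAL and BASE -> REMOTE), one option is to loop over every merge base and log both ranges. This is only a sketch: the function name list_merge_paths is made up for illustration, and it assumes it is run while the merge is still in progress, so that MERGE_HEAD exists.

```shell
#!/bin/sh
# Hypothetical helper (not a Git command): for each merge base of the
# in-progress merge, list the commits on both change paths out of it.
# Defaults assume a stopped merge, where HEAD is LOCAL and MERGE_HEAD is REMOTE.
list_merge_paths() {
    ours=${1:-HEAD}
    theirs=${2:-MERGE_HEAD}
    for base in $(git merge-base --all "$ours" "$theirs"); do
        echo "merge base: $base"
        echo "  BASE -> LOCAL:"
        git log --oneline "$base..$ours" | sed 's/^/    /'
        echo "  BASE -> REMOTE:"
        git log --oneline "$base..$theirs" | sed 's/^/    /'
    done
}
```

    Calling list_merge_paths with no arguments during the stopped merge uses HEAD and MERGE_HEAD. Swapping --oneline for something like --graph --format='%h %an %ad %s' should give the who and when as well.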
    

    Since you are using merge-recursive, what Git did was:

    • Select two of the merge bases.
    • Run git merge-recursive on them. (This may itself find more than one merge base; if so, the same procedure applies recursively.)
    • Commit the result. This is now the merge-base-so-far. (This commit has a hash ID.)
    • Pick the next of the merge bases, if there are more than two bases, and merge that with the merge-base-so-far; this is now the new merge-base-so-far.
    • Repeat until all merge bases are used up.
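
    The steps above can be roughly imitated by hand in a scratch worktree. Treat this as an approximation under stated assumptions, not a reproduction of what Git does internally: the function name rebuild_virtual_base is hypothetical, the conflict markers and commit metadata will differ from Git's internal ones, so the resulting hash will not match Git's temporary commit, and conflicts that git commit -a cannot stage (for example some modify/delete cases) will make it give up.

```shell
#!/bin/sh
# Hypothetical helper (not a Git command): fold all merge bases together,
# one at a time, in a throwaway detached worktree, and print the hash of the
# resulting "merge-base-so-far" commit. Only an approximation of the internal
# virtual ancestor.
rebuild_virtual_base() {
    ours=${1:-HEAD}
    theirs=${2:-MERGE_HEAD}
    scratch=$(mktemp -d) || return 1
    # Positional parameters now hold the merge bases.
    set -- $(git merge-base --all "$ours" "$theirs")
    [ "$#" -gt 0 ] || return 1
    # Start from the first merge base.
    git worktree add --detach "$scratch" "$1" >/dev/null 2>&1 || return 1
    shift
    for base in "$@"; do
        # Merge the next base into the merge-base-so-far. If the merge stops
        # on conflicts, commit the conflicted files as they are, markers
        # included. On failure the scratch worktree is left for inspection.
        git -C "$scratch" merge -q --no-edit "$base" >/dev/null 2>&1 ||
            git -C "$scratch" commit -q -a -m "conflicted merge of $base" ||
            return 1
    done
    git -C "$scratch" rev-parse HEAD
    git worktree remove --force "$scratch"
}
```

    The printed hash names an unreferenced commit that stays in the object database for a while, so something like git diff <that hash> HEAD can be used to look at one side's changes against the rebuilt base.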

    The final output of this process is a commit hash ID. This hash ID is not saved or shown anywhere. You can get all the inputs to this process from git merge-base --all of course.

    Normally, when a merge has conflicts, Git stops and makes you fix them. But when merging merge bases produces conflicts, Git just goes ahead and commits the conflicted merge bases. This is ... not good. (I'm not claiming it's bad here, just that it's not good: it gets very messy. The new merge-ort does not do this, I think, but I have yet to digest precisely what it does do.) These conflict markers are indeed what you are seeing.

    The tools Git has here are not quite up to the job, but using git merge-base --all, you can at least inspect each of the inputs.
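
    One way to inspect the inputs for a single conflicted file is to compare the stage-1 (BASE) blob in the index with the blob each merge base records for that path. A sketch, with the hypothetical function name compare_base_blobs and the path given relative to the top of the repository:

```shell
#!/bin/sh
# Hypothetical helper (not a Git command): print the blob ID of the BASE
# version in the index (stage 1) and the blob ID that each merge base holds
# for the same path.
compare_base_blobs() {
    path=$1
    ours=${2:-HEAD}
    theirs=${3:-MERGE_HEAD}
    # :1:<path> names the stage-1 entry, i.e. the BASE version of the file.
    echo "index stage 1: $(git rev-parse --verify -q ":1:$path" || echo '(none)')"
    for base in $(git merge-base --all "$ours" "$theirs"); do
        # <commit>:<path> names the blob for that path in that commit.
        blob=$(git rev-parse --verify -q "$base:$path" || echo '(absent)')
        echo "$base: $blob"
    done
}
```

    If the stage-1 blob matches none of the merge bases, that suggests the BASE content for that file came out of merging the bases together, which is where the conflict markers in BASE come from. Each input can then be viewed with git show <base>:<path>.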