Restoring git merge conflict flags

I am trying to work out a method of sharing merge conflicts with other members of my team. We have some very large branches and merging them creates a lot of conflicts. I have tried several different methods, and my current attempt involves pushing the files in a conflicted state to the remote repo (leaving the merge markers in the files), and then running an annoyingly long alias to grep through the files and re-create the merge files manually (LOCAL, BASE, REMOTE).

I recently found the

git checkout --conflict=merge -- (file)

command, which works great on a local branch, but as soon as it gets pushed to a remote, this command no longer works to restore the merge flags.

Is there a way to force git to re-flag a file as conflicted so people can use the normal merge tools to resolve them?

Solution

Is there a way to force git to re-flag a file as conflicted so people can use the normal merge tools to resolve them?

No, not without writing your own code anyway. (Someone should write some code, and maybe that should be me; it's time to have a tool for this. There are a bunch of corner cases that are hard, though.)

The problem here is that a conflicted merge, in Git, is represented by state stored in Git's index.

Stepping back for a moment, let's define the index, along with current commit or HEAD, and work-tree (or work tree, working tree, and a bunch of similar variants):

The current commit, also known as HEAD, is pretty straightforward. (I like to use HEAD, set in computer-text layout, to mean specifically the Git name HEAD. You can also use @ in Git version 1.8.5 or later. The name HEAD normally refers to the current branch, if there is a current branch, and the current branch then locates the tip commit of that branch, which is the current commit. Or, in "detached HEAD" mode, HEAD directly contains the hash ID of the current commit. Either way, this names the current commit.)
The work tree is, quite simply, where you do your work. Git's internal data structures that hold commits and versioned copies of files are not suitable for anything else, so Git extracts the versions into ordinary files, which you can then read and manipulate as usual.

The work tree can also hold files that you have not yet, and do not want to, commit. These are untracked files. (Technically, an untracked file is any file in the work-tree that is not already in the index, but we haven't defined the index yet. :-) )

Git tends to complain about untracked files being untracked; you can shut off these complaints by listing the files, or their path name patterns, in .gitignore files. Note that adding a file name to .gitignore does not make the file untracked. If the file is tracked, it stays tracked. The .gitignore entry primarily just shuts off complaints, and also makes Git not automatically add these untracked files when you say "add all files".
The index sits halfway between these two. Normally, when you first check out a commit or branch, the index contents match the HEAD commit contents, which Git also extracts into the work tree. You can then modify the work-tree all you like, but the index continues to match HEAD. You must git add files to copy them from the work tree, back into the index.

As such, the index essentially represents the next commit you will make. When you run git commit, Git turns the index into a new commit (which automatically becomes the new HEAD commit). Only things you copied back into the index get committed, which means you can split up changes into several commits by just git adding a few files at a time. (And, you can use git add -p to add just part of a file, rather than the whole file, so that the index version itself is partway between the HEAD commit version and the work-tree version.)
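To make the three-way relationship concrete, here is a minimal sketch in a throwaway repository. The file names a.txt and b.txt and the "Example" identity are made up for illustration. It compares the index against HEAD (what the next commit would contain) and the work tree against the index (what has not been added yet):

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

printf 'one\n' > a.txt
printf 'one\n' > b.txt
git add a.txt b.txt
git commit -q -m "initial"

# Modify both files in the work tree, but copy only a.txt back into the index.
printf 'two, edited\n' > a.txt
printf 'two, edited\n' > b.txt
git add a.txt

# Index vs. HEAD: the changes the next commit would record.
staged=$(git diff --cached --name-only)
# Work tree vs. index: changes that have not been added yet.
unstaged=$(git diff --name-only)

echo "staged:   $staged"
echo "unstaged: $unstaged"
```

Here only a.txt should show up as staged, and only b.txt as unstaged, even though both files were edited.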

If you never do any merges, or never have any merge conflicts, we could stop here and be done with the index. Of course, you're doing merges, and they are hitting conflicts, so we need to look closer.

The index has four "stage slots" per index entry

The index records files by their path names in the work-tree. If you modify some of these files, and git add them to be ready for the next commit, this updates the index version of the files. But there is a secret that shows up if you run git ls-files --stage during a conflicted merge. There is no other time that this normally shows up—it's only during conflicted merges. The secret is that each file can be in the index up to three times, in stage slots that are numbered. Slot zero is the normal, everyday slot:

$ git ls-files --stage
[snip]
100644 d8d18736e74c7a5f61d794770a2dd94786501d12 0   Makefile
100644 046dcab7645305cbf4b94adef54a859234ac3caa 0   README
100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0   lib/__init__.py

The column of all-zero values indicates that each of these files is in slot zero.

During a conflicted merge, though, README, or one of the other names, may have up to three entries in slots 1, 2, and 3. In this case, Git knows that there was a merge conflict. Slot 1 holds the merge base version of the file. Slot 2 holds the HEAD or --ours version, and slot 3 holds the MERGE_HEAD or --theirs version, of that same file. (Slot zero is, by definition, unoccupied at this time.)

Several of these slots can be empty. (I used to say that at most one of them could be empty, but I was wrong: with rename/rename or rename/delete conflicts, we can in fact see more than one empty slot.) The empty slot or slots indicate that no file had that name in one or more of the three inputs to the merge. The existence of any higher-numbered entry, though, indicates that there is a conflicted merge going on.
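If you want to see the higher-numbered slots for yourself, something like the following sketch should do it. It builds a scratch repository with one deliberately conflicting file; the branch name theirs, the file contents, and the "Example" identity are placeholders, not anything from your setup:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > README
git add README
git commit -q -m "base"

git checkout -q -b theirs
echo "theirs" > README
git commit -q -am "theirs change"

git checkout -q -          # back to the original branch
echo "ours" > README
git commit -q -am "ours change"

# The merge stops with a conflict, so keep `set -e` from aborting the script.
git merge theirs >/dev/null 2>&1 || true

# Full listing: mode, blob hash, stage number, path.
git ls-files --stage README

# Third column is the stage: 1 = merge base, 2 = ours (HEAD), 3 = theirs (MERGE_HEAD).
stages=$(git ls-files --stage README | awk '{print $3}' | xargs)
echo "stages present for README: $stages"
```

For a plain content conflict like this one, all three of slots 1, 2, and 3 are filled and slot zero is empty.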

It is, as you have seen, your job to resolve these conflicts. The work tree version of the named file normally contains Git's best attempt to resolve the merge as of this point, but since Git was not able to resolve the conflict, the work tree version has conflict markers in it. After you resolve the conflict, you should run git add on the path as usual (or git rm if the resolution is to remove the file). This clears out the higher-stage slots, while also copying the work-tree file to slot zero unless the file is really removed. Now the conflict is resolved.

If you are in the middle of a merge, have not committed the result yet, and have edited or even resolved a file but wish to restore it to its original unmerged state, you can, as you noted, use:

git checkout -m -- <path>

(or the same with --conflict in place of -m). With --conflict you can append =<style> to choose the conflict style: merge or diff3 (I prefer diff3, which includes text from the merge base version of the file). This removes the stage-zero entry, if you made one, and restores the higher-stage conflicting entries. This particular form of git checkout, though, requires that the original unmerged entries be available in the index.
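As a rough illustration of that round trip, the sketch below resolves a conflict with git add and then un-resolves it again while the merge is still uncommitted. The repository, the branch name theirs, and the file contents are invented for the example:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > README
git add README
git commit -q -m "base"
git checkout -q -b theirs
echo "theirs" > README
git commit -q -am "theirs change"
git checkout -q -
echo "ours" > README
git commit -q -am "ours change"
git merge theirs >/dev/null 2>&1 || true   # stops with a conflict

# "Resolve" the conflict: overwrite the file and add it.
echo "resolved" > README
git add README
after_add=$(git ls-files --stage README | awk '{print $3}' | xargs)

# Change our mind: restore the unmerged state, with diff3-style markers.
git checkout --conflict=diff3 -- README
after_checkout=$(git ls-files --stage README | awk '{print $3}' | xargs)

echo "stages after git add:      $after_add"
echo "stages after git checkout: $after_checkout"
cat README
```

After the git add, only slot zero should be occupied; after the git checkout, slots 1, 2, and 3 are back and README again contains conflict markers, including the merge base section that diff3 adds.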

In any case, you cannot make a new commit until all higher-stage index entries are resolved. That is, if git ls-files --stage shows any entries that are not stage zero, you may not make a new commit.

... as soon as it gets pushed to a remote, this [git checkout -m] command no longer works to restore the merge flags.

In fact, it's long before that. The ability to restore the conflict goes away once you commit. This permanently cleans out all the higher-stage index entries. But you cannot push an index, and you cannot push a file: you can only push commits. This means that to push a partial merge (to let someone else deal with it), you must resolve the merge and commit. Now it's no longer in progress and cannot be made to be in progress any more.
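The sketch below walks through that sequence in a scratch repository (again with made-up names: branch theirs, file README, identity "Example"). It shows Git refusing to commit while unmerged entries exist, and the merge state disappearing once the file is added and committed with its markers left in:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > README
git add README
git commit -q -m "base"
git checkout -q -b theirs
echo "theirs" > README
git commit -q -am "theirs change"
git checkout -q -
echo "ours" > README
git commit -q -am "ours change"
git merge theirs >/dev/null 2>&1 || true   # stops with a conflict

# With higher-stage entries in the index, the commit is refused.
if git commit -q -m "merge" >/dev/null 2>&1; then
    refused=no
else
    refused=yes
fi

# Add the file, markers and all, and commit. The merge is now finished.
git add README
git commit -q -m "merge with markers left in"

# No unmerged entries remain, and MERGE_HEAD is gone.
unmerged=$(git ls-files -u | wc -l | xargs)
if git rev-parse -q --verify MERGE_HEAD >/dev/null; then
    merge_head=present
else
    merge_head=gone
fi

echo "commit refused while unmerged: $refused"
echo "unmerged entries after commit: $unmerged"
echo "MERGE_HEAD after commit:       $merge_head"
```

The commit that results is an ordinary merge commit whose README happens to contain marker text; nothing in it tells Git that a conflict is still open.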

What is needed is a tool that can save the full index state, the work-tree files, the merge state including the IDs of the two commits HEAD and MERGE_HEAD (which together imply the ID of the merge base), and perhaps even untracked and/or ignored files (a la git stash) into a special commit or set of commits stored on a non-branch reference. This commit, or these commits, can then be transferred from one repository to another. A reverse version of the same tool can restore the merge state, index state, and work-tree. All the components necessary to build such a tool exist (because git ls-files --stage and git update-index both exist). But writing this tool-pair would be complicated, probably at least as difficult as the git stash script.
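To give a flavor of the index half of such a tool, here is a very rough sketch, not a finished design. It assumes everything happens inside one repository, so the blobs for all three stages are already present. A real tool would also have to transfer those blobs, record MERGE_HEAD, and cope with renames and deletions. The file name README, the branch theirs, and the temporary file holding the saved entries are all invented for the example:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo "base" > README
git add README
git commit -q -m "base"
git checkout -q -b theirs
echo "theirs" > README
git commit -q -am "theirs change"
git checkout -q -
echo "ours" > README
git commit -q -am "ours change"
git merge theirs >/dev/null 2>&1 || true   # stops with a conflict

# Save the unmerged entries (mode, blob hash, stage, path) outside the work tree.
saved=$(mktemp)
git ls-files -u > "$saved"

# Finish the merge with a placeholder resolution, as pushing would require.
echo "placeholder resolution" > README
git add README
git commit -q -m "merge, not really resolved"

# Restore: a mode-0 line removes the stage-0 entry, then the saved lines
# put the higher stages back. The all-zero ID is sized from a real hash.
zero=$(git rev-parse HEAD | sed 's/[0-9a-f]/0/g')
{
    printf '0 %s\t%s\n' "$zero" README
    cat "$saved"
} | git update-index --index-info

restored=$(git ls-files --stage README | awk '{print $3}' | xargs)
echo "stages restored for README: $restored"

# Rebuild the work-tree file, with conflict markers, from the restored stages.
git checkout --conflict=merge -- README
cat README
```

This only puts the index entries and the marked-up file back. Git still does not consider a merge to be in progress, so the eventual commit would be an ordinary single-parent commit unless the tool also recreated the rest of the merge state.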