git git-branch git-merge git-merge-conflict

Git merge conflict on multiple files

How can I tell git to take all conflicted files from one specific branch?

I have a merge conflict while merging branch1 into branch2. Can I tell Git to take all conflicted files from branch1, instead of adding each file one at a time?

Solution

You can.

You probably should not, but you can. (At most, you probably should use -X ours or -X theirs.)

Below, after all the setup to make sure that anyone reading this answer understands what's going on, is the simple—albeit sometimes too simple—way to choose "our" or "their" file for each conflict.

Setup, or, how we got into the mess in the first place

$ git status              # make sure everything is clean
[git status is here]

$ git checkout branch2    # get onto branch2
[git's message about checking out the branch is here]

$ git merge branch1       # merge from branch1 into branch2, as in your text
... conflict messages ...

You now have some successful merges and some conflicts. For instance, perhaps the common merge base had files a.a, b.b, c.c, and so on through z.z. Since (that is, as compared to) the merge base commit, three files (a.a, b.b, and c.c) were modified in branch1 and three files (b.b, c.c, and d.d) were modified in branch2. The changes to a.a and d.d cannot conflict, because each file was modified on one side only. But perhaps b.b and c.c conflict: several fixes were made to b.b in branch1 and one different fix was made to b.b in branch2, and that fix overlaps one of the branch1 fixes. Something similar has happened with c.c to cause conflicts there.
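To experiment safely, the scenario can be reproduced in a throwaway repository. This is only a sketch: the file contents, the user name and email, and the extra branch name base are made up for illustration, and the repository lives in a temporary directory.

```shell
set -e
# Throwaway repository; nothing here touches a real project.
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/base   # hypothetical name for the starting branch
git config user.name "Example User"
git config user.email "user@example.com"

# Merge base: four files with identical three-line content.
for f in a.a b.b c.c d.d; do
    printf 'one\ntwo\nthree\n' > "$f"
done
git add .
git commit -q -m "merge base"
git branch branch1
git branch branch2

# branch1 modifies a.a, b.b and c.c.
git checkout -q branch1
for f in a.a b.b c.c; do
    printf 'one\ntwo (branch1)\nthree\n' > "$f"
done
git commit -q -a -m "branch1 changes"

# branch2 modifies b.b, c.c and d.d; b.b and c.c change the same line as branch1.
git checkout -q branch2
for f in b.b c.c d.d; do
    printf 'one\ntwo (branch2)\nthree\n' > "$f"
done
git commit -q -a -m "branch2 changes"

# The merge stops with conflicts, so git merge exits non-zero.
git merge branch1 > /dev/null || true

# List the paths that are still unmerged: b.b and c.c.
conflicted=$(git diff --name-only --diff-filter=U)
printf '%s\n' "$conflicted"
```

Here a.a and d.d merge cleanly, while b.b and c.c are left conflicted.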

You now want to discard all the fixes made in branch2 (or in branch1) by taking the file versions from branch1 (or branch2 respectively). You can use this command:

git checkout branch1 -- b.b c.c

or:

git checkout branch2 -- b.b c.c

but this requires knowing that the two files in question are b.b and c.c.

You can also do this:

git checkout --ours -- b.b c.c

or:

git checkout --theirs -- b.b c.c

These are almost the same as the git checkout commands that use the names branch1 and branch2, but with two big differences. We'll get to these in a bit.

The merge base

I mentioned the merge base above, without defining it. It's a good idea to know what the merge base is, so that you know what Git is doing and hence what you are doing. There is a precise definition (which is also a bit complicated), but to put it simply, the merge base is the most recent common ancestor commit when we look at the chain of commits that are on the two branches. That is, if we draw (part of) the commit graph, we usually see something like this:

          o--o--o--o   <-- branch1
         /
...--o--*
         \
          o--o--o      <-- branch2

The two branches have two different tip commits (to which the branch names point). These commits point to their parent commits, which point to their parents, and so on. That is, each o (representing a commit node in the commit graph) has a left-pointing arrow to its parent commit. (A merge commit has two or more such arrows; the above shows a simple case, where there are no merges.) Eventually these two branches should (normally) meet up, and from that point leftward—backwards in time—all commits are now on both branches.

The rightmost such commit, which I drew as * instead of o, is the merge base.
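Git can report the merge base itself, using git merge-base. A minimal sketch in a throwaway repository (the empty commits, the identity settings, and the branch name base are assumptions made for the demonstration):

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/base   # hypothetical name for the starting branch
git config user.name "Example User"
git config user.email "user@example.com"

# The shared commit: this plays the role of * in the drawing.
git commit -q --allow-empty -m "shared commit"
star=$(git rev-parse HEAD)
git branch branch1
git branch branch2

# One commit that exists only on each branch.
git checkout -q branch1
git commit -q --allow-empty -m "only on branch1"
git checkout -q branch2
git commit -q --allow-empty -m "only on branch2"

# Ask Git for the best common ancestor of the two branch tips.
base=$(git merge-base branch1 branch2)
printf 'merge base: %s\nshared:     %s\n' "$base" "$star"
```

The two hashes printed at the end are the same commit.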

Git's "index" (aka staging area)

Before we get into the automatic part, we need to look at how Git defines the index, or staging-area.

The index is essentially "what Git will put in the next commit you make". That is, it has one entry for every file that will be in the commit. When you have just made a commit, the index has one entry for every (tracked) file, and that entry matches the tracked file you've just committed. Each commit has a complete snapshot of your work-tree—of all the tracked files, anyway—and that snapshot was made from the index, and the index is still the same as it was just a moment ago.

When you git add a file, you are replacing the index version with the one from your work-tree. That is, you can edit b.b to have some new content, then git add b.b to put the new b.b content into the index. The existing a.a and c.c and so on all remain unchanged, but b.b is replaced with the work-tree version.
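The effect of git add on the index can be watched with git ls-files --stage, which prints the blob hash stored for each entry. A small sketch, again in a throwaway repository with made-up content:

```shell
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
printf 'original\n' > b.b
git add b.b
git commit -q -m "initial"

before=$(git ls-files --stage b.b)     # index entry right after the commit
printf 'edited\n' >> b.b
unstaged=$(git ls-files --stage b.b)   # editing the work-tree leaves the index alone
git add b.b
after=$(git ls-files --stage b.b)      # now the entry holds a different blob hash

printf '%s\n%s\n%s\n' "$before" "$unstaged" "$after"
```

The first two lines of output are identical; only the third, taken after git add, shows a new hash.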

(If you git add a totally new file such as README, that new file goes into the index, and now README is "tracked". If you git rm a file, Git removes that file's entry from the index, so the file is left out of the next commit and is no longer tracked. For the most part you can ignore these little details and just think of the index as "the next commit you can make", though.)

During merges, though, the index takes on a new role.

Automatically discovering conflicted files

There are in fact four slots in the index, not just one, for each tracked file. Normally Git uses only slot zero (0), which is where normal, not-merge-conflicted files go. When Git encounters conflicts during a merge, it puts multiple copies of each file into the index, in slots 1, 2, and/or 3, and leaves slot 0 empty.

Slot 1 is for the merge-base version of the file. Slot 2 holds the local (current or --ours) branch version, and slot 3 holds the other (to-be-merged-in or --theirs) version. In our example above, we did git checkout branch2, so slot 2 holds the branch2 version of b.b and slot 3 holds the branch1 version of b.b (since we did git merge branch1). Meanwhile slot 1 holds the merge-base version, from commit *.

Some of these slots may be empty: for instance, if a file named new.new were added in both branches, so that it has no merge-base version, slot 1 will be empty; there will be only slot-2 and slot-3 entries.

The --ours and --theirs flags to git checkout tell git checkout to use slot 2 or slot 3. There is no corresponding flag to extract the slot-1 (merge base) version, but it is possible to get it; see the gitrevisions documentation for the :n:path syntax.
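The :n:path syntax works with git show, so all three slots can be inspected during a conflict. A sketch, assuming a throwaway repository in which both branches changed line 2 of b.b (contents and the branch name base are invented for the example):

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/base   # hypothetical name for the starting branch
git config user.name "Example User"
git config user.email "user@example.com"
printf 'one\ntwo\nthree\n' > b.b
git add b.b
git commit -q -m "merge base"
git branch branch1
git branch branch2
git checkout -q branch1
printf 'one\ntwo (branch1)\nthree\n' > b.b
git commit -q -a -m "branch1 fix"
git checkout -q branch2
printf 'one\ntwo (branch2)\nthree\n' > b.b
git commit -q -a -m "branch2 fix"
git merge branch1 > /dev/null || true   # stops with a conflict in b.b

# Print line 2 of each slot's copy of b.b.
git show :1:b.b | sed -n 2p   # slot 1, merge base:        two
git show :2:b.b | sed -n 2p   # slot 2, ours (branch2):    two (branch2)
git show :3:b.b | sed -n 2p   # slot 3, theirs (branch1):  two (branch1)
```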

You can of course also use the branch names, but that's subtly different, in two ways, as we'll see in a moment.

To find unmerged files, we simply need to find any path that is in this unmerged state. One way to do this is with git status, but this is a little tricky. A somewhat better way is to use the so-called "plumbing commands", particularly git ls-files. The ls-files command has --stage to show all the index entries plus their stage numbers. We want to know which files exist in any of stages 1, 2, and/or 3:

git ls-files --stage

which produces output like this:

[mass snip]
100755 2af1beec5f6f330072b3739e3c8b855f988173a9 0   t/t6020-merge-df.sh
100755 213deecab1e81685589f9041216b7044243acff3 0   t/t6021-merge-criss-cross.sh
100755 05ebba7afa29977196370d178af728ec1d0d9a81 0   t/t6022-merge-rename.sh
[more mass snip]

The output format specifically has the stage number (0 to 3) followed by a literal tab followed by the file name, so we want to exclude stage 0, collecting up stage 1-3 names. There are a bunch of ways to do this, such as grep -v $'0\t', but we will also need to split out the file name and make it unique (since it will occur several times if it's in all 3 slots). We can use awk to do this all at once:

git ls-files --stage | awk '$3 != 0 { path[$4]++; } END { for (i in path) print i }'

(This code is still slightly flawed as it does the wrong thing with files whose names contain white-space. Fixing it is left as an exercise.)
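One possible approach to that exercise, offered as a sketch rather than a complete fix: split each line on the tab instead of on any white-space, so that a name containing spaces stays in one piece. (Names containing tabs or newlines would still need the -z option and NUL-aware tools.) The file name my file.txt and the repository contents are invented for the demonstration.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/base   # hypothetical name for the starting branch
git config user.name "Example User"
git config user.email "user@example.com"
printf 'one\ntwo\nthree\n' > 'my file.txt'
git add 'my file.txt'
git commit -q -m "merge base"
git branch branch1
git branch branch2
git checkout -q branch1
printf 'one\ntwo (branch1)\nthree\n' > 'my file.txt'
git commit -q -a -m "branch1 fix"
git checkout -q branch2
printf 'one\ntwo (branch2)\nthree\n' > 'my file.txt'
git commit -q -a -m "branch2 fix"
git merge branch1 > /dev/null || true   # stops with a conflict

# Field 1 is "mode hash stage", field 2 is the whole path, spaces included.
paths=$(git ls-files --unmerged | awk -F'\t' '!seen[$2]++ { print $2 }')
printf '%s\n' "$paths"
```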

A more sophisticated awk (or other language) script would check that there are in fact three versions of each path, and do something—precisely what, depends on your goal—for other cases.

Note that git ls-files has a flag, -u or --unmerged, to show only unmerged files. The output is still in the --stage format (and in fact --unmerged forces --stage, just as the documentation says). We can use this instead of checking the stage-number.

Once we have a list of unmerged files, we simply run git checkout --ours -- or git checkout --theirs -- on them:

git ls-files --unmerged | \
    awk '{ path[$4]++; } END { for (i in path) print i }' |
    xargs git checkout --ours --

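After this we still need to mark the files as resolved with git add. That is true for both --ours and --theirs, though not for the branch-name form, as described in the next section.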
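Putting the pieces together for the question as asked (take branch1's version of every conflicted file while on branch2) might look like the sketch below. It swaps in git diff --name-only --diff-filter=U -z for the awk pipeline, because that prints each unmerged path once, NUL-terminated, and it relies on xargs -0, which GNU and BSD xargs support but POSIX does not require. The repository, contents, and identity settings are invented for the demonstration.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/base   # hypothetical name for the starting branch
git config user.name "Example User"
git config user.email "user@example.com"
for f in b.b c.c; do printf 'one\ntwo\nthree\n' > "$f"; done
git add .
git commit -q -m "merge base"
git branch branch1
git branch branch2
git checkout -q branch1
for f in b.b c.c; do printf 'one\ntwo (branch1)\nthree\n' > "$f"; done
git commit -q -a -m "branch1 fixes"
git checkout -q branch2
for f in b.b c.c; do printf 'one\ntwo (branch2)\nthree\n' > "$f"; done
git commit -q -a -m "branch2 fixes"
git merge branch1 > /dev/null || true   # stops with conflicts in b.b and c.c

# Step 1: copy the slot-3 (--theirs, i.e. branch1) version into the work-tree.
git diff --name-only --diff-filter=U -z | xargs -0 git checkout --theirs --

# Step 2: the paths are still unmerged, so list them again and mark them resolved.
git diff --name-only --diff-filter=U -z | xargs -0 git add --

# Step 3: conclude the merge with the prepared merge message.
git commit -q --no-edit
```

This discards every branch2 change in the conflicted files, including changes that did not conflict.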

The less-subtle issue with using a branch name instead of `--ours` etc

Let's say the above does git checkout --ours -- b.b. We then have to git add b.b to mark the file resolved, i.e., to get stages 1-3 cleared out by putting the work-tree version of b.b into stage-slot-zero.

If we do git checkout branch2 -- b.b, though, we don't have to git add b.b.

You might wonder why. I certainly do. I can tell you that this is because git checkout branch2 -- b.b copies the version of b.b from the commit to which branch2 points, into stage-slot-zero, clearing slots 1-3, and then from there into the work-tree; but git checkout --ours -- b.b copies the version of b.b from stage-slot-2 (--ours) into the work-tree, leaving the index untouched.

That is, git checkout sometimes copies into the index and then out to the work-tree, and sometimes just copies from an index slot into the work-tree. What I cannot explain is why git checkout sometimes has to write to the index first, and sometimes does not, other than "it was easier to write the code that way after writing a bunch of other code." In other words, there's no obvious master design principle at work here. You can make one up: "If checkout is to read from a tree, it shall write to the index, which writes slot-zero and clears 1-3; but if it can just read from the index, it shall not write to the index." But that seems pretty arbitrary.
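The difference is easy to observe by counting the unmerged index entries after each form of the command. A sketch in a throwaway repository (contents, identity settings, and the branch name base are assumptions):

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/base   # hypothetical name for the starting branch
git config user.name "Example User"
git config user.email "user@example.com"
printf 'one\ntwo\nthree\n' > b.b
git add b.b
git commit -q -m "merge base"
git branch branch1
git branch branch2
git checkout -q branch1
printf 'one\ntwo (branch1)\nthree\n' > b.b
git commit -q -a -m "branch1 fix"
git checkout -q branch2
printf 'one\ntwo (branch2)\nthree\n' > b.b
git commit -q -a -m "branch2 fix"
git merge branch1 > /dev/null || true   # stops with a conflict in b.b

# Reading from an index slot: the work-tree changes, slots 1-3 stay filled.
git checkout --ours -- b.b
after_ours=$(git ls-files --unmerged | wc -l | tr -d ' ')

# Reading from a commit: slot 0 is written and slots 1-3 are cleared.
git checkout branch2 -- b.b
after_branch=$(git ls-files --unmerged | wc -l | tr -d ' ')

printf 'unmerged entries after --ours: %s, after branch2: %s\n' \
    "$after_ours" "$after_branch"
```

In both cases the work-tree file ends up with the branch2 content; only the index state differs.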

The more-subtle issue with using a branch name instead of `--ours` etc

When Git is preparing a merge and getting all those index slots set up, it also does rename detection. Suppose that, instead of files a.a through z.z, we had a.a, sillyname.b, and c.c. Suppose that the file is still named sillyname.b in branch1 but was renamed to bettername.b in branch2, and that both branches changed its contents in conflicting ways. (If both branches had renamed the file, each to a different name, Git would instead report a rename/rename conflict and record each name separately.)

Git will usually automatically detect this renaming. It will then fill in slots 1, 2, and 3 with the contents of the file as found in each commit, all under one index name. Since we're on branch2 and the file is now named bettername.b, the index name will be bettername.b. The entry in slot 1 will be that for commit *'s sillyname.b. The entry in slot 2, which is our version, will be for our bettername.b. The entry in slot 3, which is the other (--theirs) version, will be for branch1's sillyname.b.

Again, they are all under our new name, so we need to git checkout --ours -- bettername.b or git checkout --theirs -- bettername.b. If we try to git checkout branch1 -- bettername.b, Git will complain that it cannot find bettername.b in branch1. Indeed, it cannot: it's still called sillyname.b there.

Sometimes you may not care, or might want to discover this sort of rename detection. In this case, you might want to use the branch names, instead of the --ours or --theirs argument. You just need to be aware of the difference.

Why `-X ours` and `-X theirs` are different (and probably what you want)

I mentioned above the case where there were multiple fixes to b.b in branch1, and only one fix (conflicting with the fixes in branch1) in branch2.

The conflict is probably only in one area. That is, suppose the fixes in branch1 included two minor spelling corrections, unrelated to the more major fix. If we take the --ours version of b.b, we will lose the two spelling fixes, even while we take our superior code fix.

If we tell the merge command to use -X ours, that says that in the case of a conflict, use our change, but in the case of some change to b.b where we did not do anything, go ahead and take their change. In this case, we will pick up their two spelling fixes.

Hence, git merge -X ours is very different from a later git checkout --ours.
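A small experiment makes the difference visible. In this sketch (file contents, the helper write_file, and the identity settings are all invented for illustration), branch1 fixes a spelling error on the first line and both branches change the last line in conflicting ways:

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/base   # hypothetical name for the starting branch
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: $1 is the first line of b.b, $2 the last line.
write_file() {
    printf '%s\n' "$1" "filler 1" "filler 2" "filler 3" "filler 4" "filler 5" "$2" > b.b
}

write_file "teh first line" "code: original"
git add b.b
git commit -q -m "merge base"
git branch branch1
git branch branch2

git checkout -q branch1
write_file "the first line" "code: branch1 fix"   # spelling fix plus a code fix
git commit -q -a -m "branch1 fixes"

git checkout -q branch2
write_file "teh first line" "code: branch2 fix"   # conflicting code fix only
git commit -q -a -m "branch2 fix"

# Conflicting hunks go to our side; non-conflicting hunks from branch1 still apply.
git merge -q --no-edit -X ours branch1
head -n 1 b.b   # the first line      (their spelling fix was kept)
tail -n 1 b.b   # code: branch2 fix   (our side won the conflict)
```

Had the merge been run without -X ours and then resolved with git checkout --ours -- b.b, the first line would still read "teh first line", because the whole file would come from branch2.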

Getting the merge conflict back

If you want to put the merge conflict back in—if you resolved a file, but then had second thoughts—git checkout has another flag, -m, that does this:

git checkout -m b.b

This "re-merges" the file, putting the conflict markers back in. (It actually uses a special set of "undo" index entries, since the merge-base version may be a virtual merge base rather than a single commit. It can only be used before the merge is committed; after that, you must re-do the entire merge.)
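As a sketch of that round trip, the following resolves a conflict in a throwaway repository and then recreates it (contents, identity settings, and the branch name base are assumptions made for the demonstration):

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/base   # hypothetical name for the starting branch
git config user.name "Example User"
git config user.email "user@example.com"
printf 'one\ntwo\nthree\n' > b.b
git add b.b
git commit -q -m "merge base"
git branch branch1
git branch branch2
git checkout -q branch1
printf 'one\ntwo (branch1)\nthree\n' > b.b
git commit -q -a -m "branch1 fix"
git checkout -q branch2
printf 'one\ntwo (branch2)\nthree\n' > b.b
git commit -q -a -m "branch2 fix"
git merge branch1 > /dev/null || true   # stops with a conflict in b.b

# Resolve the file by taking their version, then have second thoughts.
git checkout --theirs -- b.b
git add b.b
resolved=$(git status --short -- b.b)

# Recreate the conflict: markers return and the path is unmerged again.
git checkout -m -- b.b
recreated=$(git status --short -- b.b)

printf 'after git add: %s\nafter checkout -m: %s\n' "$resolved" "$recreated"
cat b.b
```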