git rebase with 'ours' merge strategy prompting to rebase --continue again and again


Here is the issue I am facing

I created a feature branch from the master branch and worked on it extensively; it is now 80-odd commits ahead of master. In these commits, I have edited some files multiple times. After a few days, someone pushed a couple of commits to master, so the pull request for the feature branch can't be merged because of merge conflicts.

I tried rebasing onto master and resolving the merge conflicts, but I keep getting more and more conflicts after each git rebase --continue:

govi@falcon:/home/my_user/project/ (feature/xyz): git rebase master
govi@falcon:/home/my_user/project/ (feature/xyz | REBASE 32/85):

For any merge conflict here, I want to keep my changes, so I tried the ours conflict resolution strategy in recursive mode. Now Git is not forcing me to resolve any conflicts, but it is asking me to run git rebase --continue almost 80-odd times.

govi@falcon:/home/my_user/project/ (feature/xyz): git rebase master -s recursive -X ours
govi@falcon:/home/my_user/project/ (feature/xyz | REBASE-i 1/85)
Last command done (1 command done):
     pick db2511c Modify file
Next command to do (1 remaining command):
     pick d1c2037 Modify file one more time

Is there a better approach to resolving merge conflicts in the above scenario? Or maybe a better way to rebase?

PS: We are not allowed to reset the master branch. I know the easy way would be to perform {reset, stash, rebase, pop} on the feature branch, but the PR is already in progress.


Solution

  • TL;DR

    You really want -X theirs. Why you want that is ... long.

    Long

    First, an initial side note: be careful with terminology: you are not using the ours strategy but rather the ours strategy option. I find Git's terminology here confusing, and prefer to call these -X options extended options, to avoid repeating the word strategy.

    Now, on to the problem itself. When using git rebase, you are, in effect, repeatedly running git cherry-pick. Each cherry-pick operation copies one commit; git rebase works by copying multiple commits. The git rebase command first lists out all the hash IDs of the commits that are to be copied, saving these into internal "to-do" files. These files then get updated as the rebase makes progress.

    (The details for these files have changed over the years and there's no real point in describing them. However, your shell prompt settings appear to read these to-do and progress files correctly, based on the "1/85" and "32/85" you're seeing here.)

    A cherry-pick operation is, technically, a full-blown three-way merge, and can therefore produce merge conflicts. But one must be quite careful here. You wrote:

    git rebase master -s recursive -X ours
    

    The strategy argument to git merge or git rebase is -s or --strategy; you are using recursive here, which is fine (an ours strategy is not). The extended options are -X, and an ours or theirs extended option does make sense—but there's a trap here: you want -X theirs.

    What's going on

    Before we dive into cherry-pick, let's look at git merge. Without this first look at git merge, some of what cherry-pick does makes no sense at all.

    To do a git merge operation, we start with a series of commits where, e.g., two different developers started with the same initial chain of commits:

    ...--F--G--H   <-- main
    

    These two developers, who we'll call Alice and Bob in the usual way, have each made some new commits. I'll work here from Alice's point of view:

           I--J   <-- alice (HEAD)
          /
    ...--H
          \
           K--L   <-- bob
    

    At this point, Alice might merge Bob's work. She has her commit J checked out, with the special name HEAD attached to the branch name alice; she now runs git merge bob to merge Bob's commit L.

    The git merge command—technically, this is the recursive strategy rather than git merge itself—locates commit L using the branch name bob. This commit becomes the third commit. Git locates commit J using the special name HEAD, and this becomes the second commit. Last—which becomes first—it works backwards through the commit graph to locate the best common commit, which in this case is commit H.

    Each commit has a full snapshot of every file that Git knew about when whoever made the commit, made the commit. So Git can now easily compare the snapshot in the merge base commit H against the snapshot in Alice's commit J, and then do the same thing with Bob's commit L:

    git diff --find-renames <hash-of-H> <hash-of-J>   # what Alice changed
    git diff --find-renames <hash-of-H> <hash-of-L>   # what Bob changed
    

    Note that the three commits in question here are:

    1. commit H, as the merge base;
    2. commit J, as --ours, via HEAD; and
    3. commit L, as --theirs, via the name bob.
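To see these three inputs for yourself, here is a minimal sketch in a throwaway repository. The file names, the demo identity, and the `main`/`alice`/`bob` branch names are assumptions made up for illustration; only `git merge-base` and `git diff --find-renames` come from the discussion above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # force a known branch name
git config user.name "Demo"
git config user.email "demo@example.com"

# Commit H: the shared starting point.
echo "base" > base.txt
git add base.txt
git commit -q -m "H"
hash_of_H=$(git rev-parse HEAD)

# Alice's commit J, built on H.
git checkout -q -b alice
echo "alice" > alice.txt
git add alice.txt
git commit -q -m "J"

# Bob's commit L, also built on H.
git checkout -q -b bob main
echo "bob" > bob.txt
git add bob.txt
git commit -q -m "L"

# From Alice's point of view: find the merge base, then run the two diffs.
git checkout -q alice
merge_base=$(git merge-base HEAD bob)
ours_changed=$(git diff --find-renames --name-only "$merge_base" HEAD)
theirs_changed=$(git diff --find-renames --name-only "$merge_base" bob)

echo "merge base:     $merge_base"
echo "ours changed:   $ours_changed"
echo "theirs changed: $theirs_changed"
```

The merge base printed here should be the hash of commit H, with each diff listing only the file that side added.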

    The merge command—the merge as a verb part of it, that is—now combines our changes, H-vs-J, with their changes, H-vs-L. It is this combining process that can produce merge conflicts.

    To the extent that there aren't merge conflicts, though, Git can automatically apply the combined changes, to the files as seen in the merge base commit H. This keeps our changes while adding their changes, which is of course just what we want from a merge.

    When there are merge conflicts, git merge stops in the middle of the merge. It leaves in Git's index all three input files: index slot #1 contains the base commit copy, slot #2 contains the --ours copy from HEAD, and slot #3 contains the --theirs copy from the commit we named with our git merge command.

    Git writes, to the work-tree version of the conflicted file, its best effort at doing the combination. Places where Git was able to combine changes on its own already contain that combination. Places where Git found an ours-vs-theirs conflict have conflict markers and two, or even all three, input files' lines, depending on how you set merge.conflictStyle.
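The index slots can be inspected directly while a merge is stopped. The following is a sketch only, run in a scratch repository with a hypothetical `file.txt` and made-up contents; it uses `git ls-files -u` to list the unmerged entries and the `:N:path` syntax of `git show` to read each slot.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo "original" > file.txt
git add file.txt
git commit -q -m "H"

git checkout -q -b alice
echo "alice version" > file.txt
git commit -q -am "J"

git checkout -q -b bob main
echo "bob version" > file.txt
git commit -q -am "L"

# Merge Bob's work into Alice's branch; this is expected to stop with a conflict.
git checkout -q alice
git merge --no-edit bob >/dev/null 2>&1 || true

# One line per slot: mode, blob hash, stage number, path.
git ls-files -u

slot1=$(git show :1:file.txt)   # merge base copy
slot2=$(git show :2:file.txt)   # --ours copy (HEAD)
slot3=$(git show :3:file.txt)   # --theirs copy (bob)
echo "base=$slot1 ours=$slot2 theirs=$slot3"

# The work-tree copy holds Git's best effort, including conflict markers.
cat file.txt
```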

    I call these kinds of conflicts low level conflicts. (Git calls them that internally, sort of.) There are also what I call high level conflicts, such as when one side—ours or theirs—modifies and/or renames a file, and the other side deletes it.

    Using an extended option, -X ours or -X theirs, tells Git: when you hit a low-level conflict, just resolve it by taking ours or theirs respectively. This has no effect on high level conflicts: you must still resolve these manually.

    Note that low-level conflicts can occur even if the two changes don't both change the same line. For instance, if the original input says:

    line 1
    line 2
    line 3
    line 4
    

    and Alice changes 2 to two while Bob changes 3 to three, Git will call this a merge conflict. Using -X ours or -X theirs will discard one of the two changes. It's a good idea to actually test such merges before moving on. (Well, it's a good idea to test any merge: just because Git thought that it was OK to combine two different sets of changes, does not mean that it really was OK.)
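The adjacent-line case can be tried in a scratch repository. This is a sketch under assumptions: the branch names, `file.txt`, and the demo identity are invented for illustration. It first attempts a plain merge, records whether it conflicted, aborts it, and then repeats the merge with `-X ours`.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'line 1\nline 2\nline 3\nline 4\n' > file.txt
git add file.txt
git commit -q -m "base"

# Alice changes line 2 only.
git checkout -q -b alice
printf 'line 1\nline two\nline 3\nline 4\n' > file.txt
git commit -q -am "alice: 2 -> two"

# Bob changes line 3 only.
git checkout -q -b bob main
printf 'line 1\nline 2\nline three\nline 4\n' > file.txt
git commit -q -am "bob: 3 -> three"

git checkout -q alice
if git merge -q --no-edit bob >/dev/null 2>&1; then
    plain_merge="clean"
else
    plain_merge="conflict"
    git merge --abort
fi
echo "plain merge: $plain_merge"

# Same merge, but low-level conflicts are resolved in favor of ours.
git merge -q --no-edit -X ours bob
cat file.txt
```

After the `-X ours` merge, Bob's edit to line 3 is gone from the result, which is why testing the merged tree matters.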

    Recap

    The takeaways from the above—re-read through it if needed—are:

    • The -s strategy is in charge of all the work; we're talking here about -s recursive (though -s resolve does the same kind of thing).
    • A merge operation has three inputs: base = #1, ours or HEAD = #2, theirs = #3.
    • Git will combine unconflicted changes on its own, regardless of -X options.
    • Git will stop with high-level conflicts, regardless of -X options.
    • The -X options will favor either "ours" (#1-vs-#2) or "theirs" (#1-vs-#3) to resolve low-level conflicts.

    Cherry-pick

    We're now ready to look at what git cherry-pick really does. The action for a cherry-pick is often described as repeat the changes from a previous commit. While this captures the goal, it doesn't cover the mechanism. The mechanism is irrelevant up until a merge conflict occurs, and then suddenly it's terribly important.

    To talk about the mechanism, let's draw another commit graph fragment. This time, instead of Alice and Bob diverging from some common starting point H, let's just look at one or two programmers working on two different features, for instance:

    ...--P--C--N--O   <-- feature1
    
    ...--R--S--T   <-- feature2 (HEAD)
    

    Commit C is the child of parent commit P; commit N comes after C and O comes after N; these are all found through the name feature1.

    Commit T is the last commit on feature2, and we have branch feature2 checked out right now. So commit T is the HEAD commit.

    We need some new code to apply to T, and we realize: Wait, I just saw that code, or wrote it last week. It was in commit C! So we run git log to find the actual hash ID of commit C, then run:

    git cherry-pick <hash-of-C>
    

    to copy that commit.

    In order to do the copying—to find out what changed between parent commit P and child commit C—Git will run the same git diff --find-renames that we saw above with git merge. But that just gets their change. In order to apply their change to our commit, Git will first run another git diff --find-renames, this time comparing parent P with our current / HEAD commit T.

    In other words, Git runs:

    git diff --find-renames <hash-of-P> <hash-of-T>   # what we changed
    git diff --find-renames <hash-of-P> <hash-of-C>   # what they changed
    

    and now Git combines the changes, using the same merge engine as usual (-s recursive), and applies the combined changes to the snapshot in P. This preserves our work, and adds their change. Commit P becomes the merge base, and commit T is the --ours while C is the --theirs.

    Merge conflicts, if any occur, are because of these two git diff operations. If they do occur, index slot #1 contains files from the merge base P, slot #2 contains ours from T, and slot #3 contains theirs from C. The --ours option to git checkout makes sense, because T really is our commit. The -X ours option makes sense, because T is our commit.
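A small experiment makes the three roles concrete. This sketch assumes a scratch repository with a hypothetical `file.txt` whose contents name the commit that wrote them; the graph is trimmed to just P, C, and T, so it is not the exact graph drawn above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

echo "v0" > file.txt
git add file.txt
git commit -q -m "root"

# feature1: parent commit P, then child commit C.
git checkout -q -b feature1
echo "from P" > file.txt
git commit -q -am "P"
echo "from C" > file.txt
git commit -q -am "C"
hash_of_C=$(git rev-parse HEAD)

# feature2: commit T, which becomes HEAD.
git checkout -q -b feature2 main
echo "from T" > file.txt
git commit -q -am "T"

# Copy C onto T; all three versions differ, so this should stop with a conflict.
git cherry-pick "$hash_of_C" >/dev/null 2>&1 || true

slot1=$(git show :1:file.txt)   # merge base: C's parent, P
slot2=$(git show :2:file.txt)   # --ours: HEAD, commit T
slot3=$(git show :3:file.txt)   # --theirs: the commit being copied, C
echo "base=$slot1 | ours=$slot2 | theirs=$slot3"

git cherry-pick --abort
```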

    Rebase

    As mentioned above, the way git rebase works is to list out the commit hash IDs of some series of commits that need to be copied. Then it uses Git's detached HEAD mode to check out one particular commit. For illustration, let's draw a small rebase with just three commits to do:

           C--D--E   <-- branch (HEAD)
          /
    ...--B--F--G   <-- mainline
    

    Here, the commits we'd like copied are C, D, and E. The old base was commit B. Commits F and G got added to the mainline branch. So we run:

    git checkout branch
    git rebase mainline
    

    Git uses the current commit E and works backwards to find the three commits to copy, while using the name mainline and working backwards to find that commit B is the shared commit at which the copying stops. Then, Git uses the name mainline to get into detached HEAD mode:

           C--D--E   <-- branch
          /
    ...--B--F--G   <-- HEAD, mainline
    

    Git is now ready to copy commit C. Internally, at this point, Git runs git cherry-pick <hash-of-C> and git cherry-pick does its thing.

    If all goes well, the "merge" that cherry-pick runs works: Git compares base B with "our" commit G, compares base B with "their" commit C, combines the two differences on top of commit B, and makes a new commit that we will call C':

           C--D--E   <-- branch
          /
    ...--B--F--G   <-- mainline
                \
                 C'  <-- HEAD
    

    Git now repeats this with commit D. The "merge" uses commit C as its merge base, C' as --ours, and D as --theirs. Git combines the changes, applies the combined changes to the snapshot in merge base C, and makes new commit D':

           C--D--E   <-- branch
          /
    ...--B--F--G   <-- mainline
                \
                 C'-D'  <-- HEAD
    

    Git now cherry-picks E: D is the merge base, D' is --ours, and E is --theirs, and the new commit completes the copying process:

           C--D--E   <-- branch
          /
    ...--B--F--G   <-- mainline
                \
                 C'-D'-E'  <-- HEAD
    

    With the copying done, git rebase now only needs to yank the name branch off the old tip commit E, and make it point to the commit that HEAD currently names, i.e., E', and re-attach HEAD to make everything look normal:

           C--D--E   [abandoned]
          /
    ...--B--F--G   <-- mainline
                \
                 C'-D'-E'  <-- branch (HEAD)
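The copy-then-move-the-name behavior can be checked in a scratch repository. This is a sketch under assumptions: each commit adds a hypothetical file named after itself so that nothing conflicts, and the demo identity is made up.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Demo"
git config user.email "demo@example.com"

echo "B" > B.txt
git add B.txt
git commit -q -m "B"

# Commits C, D, E on the side branch.
git checkout -q -b branch
for name in C D E; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "$name"
done
old_tip=$(git rev-parse branch)

# Commits F and G on mainline.
git checkout -q mainline
for name in F G; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "$name"
done

git checkout -q branch
git rebase -q mainline

new_tip=$(git rev-parse branch)
copied=$(git log --reverse --format=%s mainline..branch)
echo "old tip: $old_tip"
echo "new tip: $new_tip"
echo "commits now on top of mainline:"
echo "$copied"
```

The subjects are the same, but the tip hash changes because the copies have different parents. The original commits remain reachable through the reflog for a while.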
    

    Note what --ours means

    During the cherry-picking part of a rebase, --ours referred to:

    • commit G, at first
    • then commit C',
    • and then commit D'.

    So --ours refers first to their commit G, then to our own commits as built on the new branch.

    The --theirs commits were, in order, C, then D, then E. So --theirs refers to our commits, always.
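The role swap is easy to confirm by stopping a rebase on a conflict and reading the index slots. A sketch, assuming a scratch repository where a hypothetical `file.txt` is edited on both sides:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Demo"
git config user.email "demo@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "B"

# Our commit C on the side branch.
git checkout -q -b branch
echo "branch change" > file.txt
git commit -q -am "C"

# Their commit G on mainline.
git checkout -q mainline
echo "mainline change" > file.txt
git commit -q -am "G"

# Rebase our branch; copying C is expected to stop with a conflict.
git checkout -q branch
git rebase mainline >/dev/null 2>&1 || true

base_copy=$(git show :1:file.txt)
ours_copy=$(git show :2:file.txt)     # HEAD during the rebase: mainline's G
theirs_copy=$(git show :3:file.txt)   # the commit being copied: our own C
echo "base=$base_copy | --ours=$ours_copy | --theirs=$theirs_copy"

git rebase --abort
```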

    The merge base commits were, in order, B, then C, then D. There's no --base option to refer to these, but the first one was "their" commit and the other two were ours.

    If we want to override "their" (mainline) branch changes, then, we need to use --theirs, not --ours, most of the time.
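Putting this together, here is a sketch of a rebase that favors the branch's own changes. The repository, file names, and contents are assumptions for illustration. As noted earlier, `-X theirs` only settles low-level conflicts, so a real rebase can still stop on high-level ones.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Demo"
git config user.email "demo@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "B"

# Two commits on our branch that edit the same file repeatedly.
git checkout -q -b branch
echo "branch edit 1" > file.txt
git commit -q -am "C"
echo "branch edit 2" > file.txt
git commit -q -am "D"

# Mainline edits the same file and also adds an unrelated one.
git checkout -q mainline
echo "mainline edit" > file.txt
echo "added on mainline" > other.txt
git add other.txt
git commit -q -am "G"

# During a rebase, --theirs is the commit being copied, i.e. our own work.
git checkout -q branch
git rebase -q -X theirs mainline

cat file.txt
cat other.txt
git log --oneline mainline..branch
```

The conflicting edit from mainline is overridden, while its non-conflicting addition (`other.txt`) is kept. Review the result before pushing, since overridden changes are discarded silently.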