
Why extra changes in git cherry-pick conflict?


When git cherry-pick results in a conflict, why does Git show more changes than just those from the given commit?

Example:

-bash-4.2$ git init
Initialized empty Git repository in /home/pfusik/cp-so/.git/
-bash-4.2$ echo one >f
-bash-4.2$ git add f
-bash-4.2$ git commit -m "one"
[master (root-commit) d65bcac] one
 1 file changed, 1 insertion(+)
 create mode 100644 f
-bash-4.2$ git checkout -b foo
Switched to a new branch 'foo'
-bash-4.2$ echo two >>f
-bash-4.2$ git commit -a -m "two"
[foo 346ce5e] two
 1 file changed, 1 insertion(+)
-bash-4.2$ echo three >>f
-bash-4.2$ git commit -a -m "three"
[foo 4d4f9b0] three
 1 file changed, 1 insertion(+)
-bash-4.2$ echo four >>f
-bash-4.2$ git commit -a -m "four"
[foo ba0da6f] four
 1 file changed, 1 insertion(+)
-bash-4.2$ echo five >>f
-bash-4.2$ git commit -a -m "five"
[foo 0326e2e] five
 1 file changed, 1 insertion(+)
-bash-4.2$ git checkout master
Switched to branch 'master'
-bash-4.2$ git cherry-pick 0326e2e
error: could not apply 0326e2e... five
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'
-bash-4.2$ cat f
one
<<<<<<< HEAD
=======
two
three
four
five
>>>>>>> 0326e2e... five

I was expecting just the "five" line between the conflict markers. Can I switch Git to my expected behavior?


Solution

    Before we go any further, let's draw the commit graph:

    A   <-- master (HEAD)
     \
      B--C--D--E   <-- foo
    

    or, so that you can compare, here's the way Git draws it:

    $ git log --all --decorate --oneline --graph
    * 7c29363 (foo) five
    * a35e919 four
    * ee70402 three
    * 9a179e6 two
    * d443a2a (HEAD -> master) one
    

    (Note that I turned your question into a command sequence, which I've appended; my commit hashes are of course different from yours.)

    Cherry-pick is a peculiar form of merge

    The reason you see the somewhat pathological behavior here is that git cherry-pick is actually performing a merge operation. The oddest part about this is the chosen merge base.

    A normal merge

    For a normal merge, you check out some commit (by checking out some branch, which checks out the tip commit of that branch) and run git merge other. Git locates the commit specified by other, then uses the commit graph to locate the merge base, which is often pretty obvious from the graph. For instance, when the graph looks like this:

              o--o--L   <-- ours (HEAD)
             /
    ...--o--B
             \
              o--o--R   <-- theirs
    

    the merge base is simply commit B (for base).
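
    You can ask Git directly which commit it would pick as the merge base. The following is a minimal sketch, not part of the original example: it builds a small forked history in a temporary directory and runs git merge-base. The branch names ours and theirs, the file names, and the throwaway identity are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: build a tiny forked history and ask Git for the merge base.
# Branch names, file names and the identity below are illustrative only.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/ours        # name the initial branch regardless of init.defaultBranch
git config user.name "Example"               # throwaway identity so commits work anywhere
git config user.email "example@example.com"

echo base >f
git add f
git commit -q -m "B"
base=$(git rev-parse HEAD)                   # remember commit B

git checkout -q -b theirs
echo theirs >>f
git commit -q -a -m "R"

git checkout -q ours
echo ours >g
git add g
git commit -q -m "L"

found=$(git merge-base ours theirs)          # the commit a normal merge would use as its base
echo "merge base: $found"
echo "commit B:   $base"
```

    The two hashes printed at the end should be identical, since B is the only commit reachable from both branch tips.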

    To do the merge, Git then makes two git diffs: one from the merge base to our local commit L on the left, and one from the merge base to their commit R on the right (sometimes called the remote commit). That is:

    git diff --find-renames B L   # find what we did on our branch
    git diff --find-renames B R   # find what they did on theirs
    

    Git can then combine these changes, applying the combined changes to B, to make a new merge commit whose first parent is L and second parent is R. That final commit is a merge commit, which uses the word "merge" as an adjective. We often just call it a merge, which uses the word "merge" as a noun.

    To get this merge-as-a-noun, though, Git had to run the merge machinery, to combine two sets of diffs. This is the process of merging, using the word "merge" as a verb.

    A cherry-pick merge

    To do a cherry-pick, Git runs the merge machinery—the merge as a verb, as I like to put it—but picks out a peculiar merge base. The merge base of the cherry-pick is simply the parent of the commit being cherry-picked.

    In your case, you're cherry-picking commit E. So Git is merging (verb) with commit D as the merge base, commit A as the left/local L commit, and commit E as the right-side R commit. Git generates the internal equivalent of two diff listings:

    git diff --find-renames D A   # what we did
    git diff --find-renames D E   # what they did
    

    What we did was to delete three lines: the ones reading two, three, and four. What they did was to add one line: the one reading five.
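
    These two diffs can be reproduced by hand. The sketch below rebuilds the history from the question in a temporary directory and then runs both diffs, using foo^ as one way to name commit D (the parent of the tip of foo). The temporary directory, the throwaway identity and the variable names are my own choices, not something from the question.

```shell
#!/bin/sh
# Sketch: rebuild the question's history, then show the two diffs
# that the cherry-pick effectively compares. Identity is a throwaway.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # force the branch name used in the question
git config user.name "Example"
git config user.email "example@example.com"

echo one >f
git add f
git commit -q -m "one"
git checkout -q -b foo
for n in two three four five; do
    echo "$n" >>f
    git commit -q -a -m "$n"
done
git checkout -q master

# foo^ is commit D, the parent of the commit being picked (E, the tip of foo).
ours=$(git diff --find-renames foo^ master)  # "what we did"
theirs=$(git diff --find-renames foo^ foo)   # "what they did"
printf '%s\n' "$ours"
printf '%s\n' "$theirs"

# Count changed lines, skipping the ---/+++ file header lines.
deleted=$(printf '%s\n' "$ours" | grep -c '^-[^-]')
added=$(printf '%s\n' "$theirs" | grep -c '^+[^+]')
echo "we deleted $deleted lines, they added $added"
```

    The first diff removes the lines two, three and four; the second adds only five. Both changes touch the same region right after the line one, which is why Git reports a conflict.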

    Using merge.conflictStyle

    This all becomes somewhat clearer—well, maybe somewhat clearer 😅—if we set merge.conflictStyle to diff3. Now instead of just showing us the ours and theirs sections surrounded by <<<<<<< etc., Git adds the merge base version as well, marked with |||||||:

    one
    <<<<<<< HEAD
    ||||||| parent of 7c29363... five
    two
    three
    four
    =======
    two
    three
    four
    five
    >>>>>>> 7c29363... five
    

    We now see that Git claims that we deleted three lines from the base, while they kept those three lines and added a fourth.
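
    To get this style of output yourself, one option is to pass the setting for a single command with git -c, as in the sketch below. It rebuilds the question's history in a temporary directory (with a throwaway identity, both assumptions of mine) and then cherry-picks with diff3 markers enabled.

```shell
#!/bin/sh
# Sketch: reproduce the conflict with diff3-style markers.
# The temporary directory and identity are illustrative only.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo one >f
git add f
git commit -q -m "one"
git checkout -q -b foo
for n in two three four five; do
    echo "$n" >>f
    git commit -q -a -m "$n"
done
git checkout -q master

# The cherry-pick is expected to stop with a conflict, so do not let set -e abort here.
git -c merge.conflictStyle=diff3 cherry-pick foo >/dev/null 2>&1 || true

cat f    # conflicted file, including the ||||||| base section
```

    To make the setting permanent, run git config merge.conflictStyle diff3 (add --global to apply it to all repositories). If a conflict was already written in the default style, git checkout --conflict=diff3 f rewrites the markers for that file, discarding any resolution work done in it so far.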

    Of course, we need to understand that the merge base here was the parent of commit E, which is, if anything, "ahead of" our current commit A. It's not really true that we deleted three lines. In fact, we never had the three lines in the first place. We just have to deal with Git showing things as if we had deleted some lines.
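
    The point that the merge machinery only sees three file versions, and knows nothing about how the history got there, can be illustrated without any commits at all. The sketch below feeds three plain files to git merge-file, with contents matching f in commits A, D and E. The file names ours, base and theirs and the labels passed with -L are made up for this illustration.

```shell
#!/bin/sh
# Sketch: a three-way merge of plain files, no repository history involved.
# File names and labels are illustrative only.
set -e
work=$(mktemp -d)
cd "$work"
printf 'one\n' >ours                            # f as of commit A
printf 'one\ntwo\nthree\nfour\n' >base          # f as of commit D
printf 'one\ntwo\nthree\nfour\nfive\n' >theirs  # f as of commit E

status=0
# merge-file exits with the number of conflicts, so capture it instead of aborting.
merged=$(git merge-file -p --diff3 -L ours -L base -L theirs ours base theirs) || status=$?
printf '%s\n' "$merged"
echo "conflicts reported: $status"
```

    Given only these three inputs, the merge has no way to tell "we deleted three lines" apart from "we never had them", which is why the conflict looks the way it does.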

    Appendix: script to generate the clash

    #! /bin/sh
    set -e
    mkdir t
    cd t
    git init
    echo one >f
    git add f
    git commit -m "one"
    git checkout -b foo
    echo two >>f
    git commit -a -m "two"
    echo three >>f
    git commit -a -m "three"
    echo four >>f
    git commit -a -m "four"
    echo five >>f
    git commit -a -m "five"
    git checkout master
    git cherry-pick foo