I need to do a cherry-pick in GitPython. I couldn't find this command, so I decided that I have to do it in some other way.
Also, I'm just interested in how it works internally.
I understand cherry-picking commit A as trying to apply the diff between A^ and A to HEAD. But I suspect it can be expressed in terms of merges somehow. That's why I'm asking for plumbing commands.
I tried to find something like git-cherry-pick.sh in the Git repo on GitHub but couldn't find anything but tests and documentation.
Cherry-picking is a fundamental building block: there is no lower-level, "pure plumbing" operation that it decomposes into. The reason is that cherry-pick does a merge with a (potentially, at least) somewhat screwy merge base, namely the parent of the commit being picked, which need not be an ancestor of HEAD at all.
That said, with proper inputs, git apply implements cherry-picking when run as git apply -3 (but git apply is not a plumbing command either). This works by using the index lines in each git diff, which record the blob hashes of the "before" and "after" versions of each file. The index lines provide the otherwise-missing merge base information. There is still one difference here, though, having to do with rename detection.
If there are no renames, the two are equivalent. This is because a merge operation has one key difference from a simple patch: a merge has a merge base, from which we can derive two patches.
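As a rough sketch of the git apply -3 route (an illustration, not how cherry-pick is implemented inside Git), the two commands can be driven from Python. The helper names are made up for this example. I'm assuming the commit has exactly one parent and contains no binary changes, and I pass --full-index so the index lines carry complete blob hashes.

```python
import subprocess


def patch_commands(rev):
    """Return two argv lists: one that diffs rev against its parent,
    and one that applies a patch from stdin with 3-way fallback."""
    make_patch = ["git", "diff", "--full-index", f"{rev}^", rev]
    apply_patch = ["git", "apply", "-3"]  # with no file argument, reads stdin
    return make_patch, apply_patch


def apply_commit_as_patch(repo_dir, rev):
    """Apply the changes of rev to the index and work tree in repo_dir.

    Assumption: rev has a single parent. A failed or conflicting apply
    surfaces as subprocess.CalledProcessError because of check=True.
    """
    make_patch, apply_patch = patch_commands(rev)
    patch = subprocess.run(
        make_patch, cwd=repo_dir, check=True, capture_output=True
    ).stdout
    subprocess.run(apply_patch, cwd=repo_dir, input=patch, check=True)
```

Unlike git cherry-pick, this does not create a commit; it only updates the index and the work tree, so committing is a separate step.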
Consider the following sequence:
Alice and Bob start with a common Git repository, with file readme.txt in some commit.
Alice changes line 10 so that instead of saying "bees are purple", it says "bees are green". She also changes line 9 so that the file says "Everything below is bizarre." (And then, of course, Alice commits the new file.)
Bob changes line 10 so that instead of saying "bees are purple", it says "bees are green", and also adds a new line 20 claiming that "submarines climb trees."
Now, if Alice gets Bob's change as a plain patch (without an index line, as just a contextual diff, e.g., from diff -u) and feeds that into her Git, Alice's Git won't know what to do with Bob's change to line 10. It will have no problem with the line-20 addition, but the context for the "bees are green" change doesn't match: Bob's version of the surrounding lines doesn't have the "bizarre" bit that Alice's file now has.
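To make the failure concrete, here is a toy model of context-checked patching. It is not how git apply is actually written; apply_hunk and the 19-line file contents are invented for the illustration, and each hunk carries one line of context.

```python
def apply_hunk(lines, start, old, new):
    """Replace lines[start:start+len(old)] with new, but only if the
    existing lines match old exactly (context included)."""
    if lines[start:start + len(old)] != old:
        raise ValueError(f"hunk at line {start + 1} does not apply")
    return lines[:start] + new + lines[start + len(old):]


# Shared starting point: a 19-line file (made-up contents).
base = [f"line {n}" for n in range(1, 20)]
base[9] = "bees are purple"                    # line 10

# Alice's committed version: lines 9 and 10 changed.
alice = list(base)
alice[8] = "Everything below is bizarre."
alice[9] = "bees are green"

# Bob's two hunks, recorded against base: (start index, old, new).
bee_hunk = (8, ["line 9", "bees are purple"], ["line 9", "bees are green"])
sub_hunk = (18, ["line 19"], ["line 19", "submarines climb trees."])

alice = apply_hunk(alice, *sub_hunk)           # context matches, applies

try:
    apply_hunk(alice, *bee_hunk)               # context no longer matches
    bee_applied = True
except ValueError:
    bee_applied = False
```

The line-20 hunk goes in because line 19 is untouched on both sides; the line-10 hunk is rejected because neither its context nor its "before" text is present in Alice's file any more.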
If, on the other hand, Alice gets Bob's change as a "cherry-pick-able patch" (either by running an actual git cherry-pick or by getting a diff with an index line and using git apply -3 or equivalent), Alice's Git now has more information. Alice's Git can now see not just that Bob changed readme.txt, but which version of that file he had when he started. Specifically, the index line has the blob hash of the "before" version of readme.txt, and since Alice and Bob started with the same version of the file, in the same commit, Alice's Git has that blob in its own object database. (The index line also has the hash of Bob's "after" version, which Alice doesn't have. The entire "after" version could now be constructed by applying the patch to the base if necessary, but it's unnecessary.)
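The hash in that line is an ordinary Git blob ID, so anyone holding the same file contents computes the same value. A minimal sketch, assuming a SHA-1 repository (not the newer SHA-256 format); the function names and the sample line are made up for the example:

```python
import hashlib
import re


def blob_hash(content: bytes) -> str:
    """Git blob ID: SHA-1 over the header 'blob <size>\\0' plus the contents."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# Matches lines such as "index e69de29..ce01362 100644" (mode is optional).
INDEX_LINE = re.compile(r"^index ([0-9a-f]+)\.\.([0-9a-f]+)(?: \d+)?$")


def parse_index_line(line: str):
    """Return the (before, after) blob hashes, possibly abbreviated."""
    match = INDEX_LINE.match(line)
    if match is None:
        raise ValueError(f"not an index line: {line!r}")
    return match.group(1), match.group(2)


before, after = parse_index_line("index e69de29..ce01362 100644")
```

If blob_hash of some file Alice already has starts with the "before" value, her Git can use that file as the merge base for this one path.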
Now Alice's Git can run its own diff: it can diff the base version against Alice's current version, to see what Alice did. It could also diff the base version against Bob's version, but there is no need: that is the patch it already has. Now it can (try to) combine the two patches: it sees that Bob's change to line 10 is redundant, since it's contained within Alice's own changes, and concentrates only on line 20. Now Alice's Git can apply the patch.
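The combining rule can be sketched as a toy three-way merge. This is deliberately simplified: it lines the three versions up by line number, whereas Git's real merge works on diff hunks, so where Git reports conflicts can differ from what this toy does. The name merge3 and the file contents are invented for the example.

```python
from itertools import zip_longest


def merge3(base, ours, theirs):
    """Toy three-way merge over position-aligned lines.

    Returns (merged_lines, conflicting_line_numbers). A missing line
    (a file shorter than the others) is represented by None.
    """
    merged, conflicts = [], []
    rows = zip_longest(base, ours, theirs)
    for number, (b, o, t) in enumerate(rows, start=1):
        if o == t:        # both sides agree: untouched, or the same change
            pick = o
        elif o == b:      # only "theirs" changed this line
            pick = t
        elif t == b:      # only "ours" changed this line
            pick = o
        else:             # both changed it, differently
            conflicts.append(number)
            pick = o      # keep ours as a placeholder
        if pick is not None:
            merged.append(pick)
    return merged, conflicts


base = [f"line {n}" for n in range(1, 20)]     # 19 lines, made-up contents
base[9] = "bees are purple"                    # line 10

alice = list(base)
alice[8] = "Everything below is bizarre."      # line 9
alice[9] = "bees are green"                    # line 10

bob = list(base)
bob[9] = "bees are green"                      # same change to line 10
bob.append("submarines climb trees.")          # new line 20

merged, conflicts = merge3(base, alice, bob)
```

With the base available, the identical line-10 change is taken once, Alice's line 9 is kept, and Bob's line 20 is added.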
That's what a merge base is (and does) for a file. The rename case comes in when, and for Git only when, Git can diff an entire tree, i.e., it needs the commit as a whole (or at least the tree object attached to the commit). Here git apply is out of its depth, since it works on one file at a time. (The git am code might be able to deal with this if the incoming patch has "rename" instructions within it, but I don't think that's in Git, though I admit to not having looked lately.)
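Coming back to the practical GitPython part of the question: GitPython has no dedicated cherry-pick method as far as I know, but its repo.git object forwards method calls to the git command line, turning underscores in the method name into dashes. So a thin wrapper is enough. The function name cherry_pick below is my own, and the path and commit ID in the usage comment are placeholders.

```python
def cherry_pick(repo, rev, no_commit=False):
    """Run `git cherry-pick [--no-commit] <rev>` via GitPython.

    repo is expected to be a git.Repo instance; repo.git.cherry_pick
    is dispatched by GitPython to the `git cherry-pick` command.
    """
    args = ["--no-commit"] if no_commit else []
    return repo.git.cherry_pick(*args, rev)


# Usage sketch (requires GitPython; path and commit ID are placeholders):
#
#   from git import Repo
#   repo = Repo("/path/to/repo")
#   cherry_pick(repo, "abc1234")
```

If the pick stops on a conflict, git exits with a non-zero status and GitPython raises GitCommandError, so callers that want to resolve conflicts themselves should catch that.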