TL;DR: If I fetch remote changes into a local git repo, then do a merge, and some time later I fetch some new changes, but this time I do rebase instead of merge, then the previously created merge commit disappears. Why?
Example
Consider the following starting point, as shown by the command git log --all --graph --decorate --oneline:
* 28992d3 (repo1/master) hello4
* 3610bdf hello3
| * f113d63 (HEAD -> master) bye-bye
| * cabc896 bye
|/
* 75f7ca9 hello2
* 525cb4a hello1
I.e., there is a git repo with a master branch that has some local, unpushed changes. Some other changes have just been fetched from the remote (in this case repo1).
Next command: git merge repo1/master. Result:
* b94aa29 (HEAD -> master) Merge remote-tracking branch 'repo1/master'
|\
| * 28992d3 (repo1/master) hello4
| * 3610bdf hello3
* | f113d63 bye-bye
* | cabc896 bye
|/
* 75f7ca9 hello2
* 525cb4a hello1
Now let's say there are some new commits both locally and in the remote repo1, and the remote contents are fetched again from repo1 via git fetch repo1 master. The result looks like this:
* 2e3d749 (repo1/master) hello6
* b17983d hello5
| * 2e49819 (HEAD -> master) see ya
| * c2f2d5a good-bye
| * b94aa29 Merge remote-tracking branch 'repo1/master'
| |\
| |/
|/|
* | 28992d3 hello4
* | 3610bdf hello3
| * f113d63 bye-bye
| * cabc896 bye
|/
* 75f7ca9 hello2
* 525cb4a hello1
So far so good.
Now let's do git rebase repo1/master, and the result is a nice, linear commit log:
* 101e524 (HEAD -> master) see ya
* 3ce7543 good-bye
* 849cbd4 bye-bye
* 483bab8 bye
* 2e3d749 (repo1/master) hello6
* b17983d hello5
* 28992d3 hello4
* 3610bdf hello3
* 75f7ca9 hello2
* 525cb4a hello1
Question: where did the commit b94aa29 Merge remote-tracking branch 'repo1/master' go? (As far as I can see, it was not preserved even as a "dead" commit, as happens e.g. when committing on a detached HEAD.)
Remarks: I can imagine an answer along the lines of "we don't need b94aa29 anymore, because we will have all its contents anyway", but can you please explain in more detail what is going on? And is it always true that rebasing onto a previously merged branch throws away all merge commits?

git rebase functionally means: select some commits to copy; copy them, as if by git cherry-pick; then move the branch name to point to the copies. The copying literally can't copy merges, so it usually doesn't bother trying.
The general idea here is to take a series of commits:
             A--B--C--D   <-- topic (HEAD)
            /
...--o--o--*--o--o   <-- mainline
and transplant them to a series of new-and-improved commits:
             A--B--C--D   [abandoned]
            /
...--o--o--*--o--o   <-- mainline
                  \
                   A'-B'-C'-D'   <-- topic (HEAD)
The "improvement" is to base the new chain on the tip of some other branch, such as mainline. To make this happen, Git literally must copy the original commits (A-B-C-D, here) to different commits that have different hash IDs, because every commit, once made, is permanent[1] and set in stone; a commit that is even one single bit different gives you a new, different commit hash ID, even if the only difference is the parent ID stored in the new commit. So even if the source tree in snapshot A' matches the source tree in snapshot A (and it probably doesn't), the commit ID for A' is different from the commit ID for A.
(This carries on through the rest of the commits as well, of course.)
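The claim that changing only the parent yields a different hash can be checked in a throwaway repository. This is a minimal sketch: the repository location, file name, commit messages, and fixed dates are all made up for illustration. It uses the plumbing command git commit-tree to create two commits from the same tree, message, identity, and timestamps, differing only in their parent.

```shell
set -e
cd "$(mktemp -d)"                       # scratch directory (hypothetical demo repo)
git init -q demo && cd demo
git config user.name "Demo"
git config user.email "demo@example.com"

echo one > file.txt; git add file.txt; git commit -q -m "base"
echo two > file.txt; git commit -q -am "second"

# Pin the timestamps so that the parent is the only thing that differs.
export GIT_AUTHOR_DATE="2000-01-01 00:00:00 +0000"
export GIT_COMMITTER_DATE="2000-01-01 00:00:00 +0000"

tree=$(git rev-parse 'HEAD^{tree}')
with_parent_1=$(git commit-tree "$tree" -p HEAD~1 -m "copy")
with_parent_2=$(git commit-tree "$tree" -p HEAD -m "copy")

echo "parent = base:   $with_parent_1"
echo "parent = second: $with_parent_2"
```

The two printed IDs differ even though both commits record the same snapshot, which is why a rebased commit can never keep its original hash.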
The arguments you give to git rebase select two things: which commits to copy, and where to put the copies.
Normally you can get away with a single name for both of these. For instance, git rebase mainline means to put the copies after the commit to which mainline points, and to copy those commits that are reachable from the commit to which topic (the current branch name) points (i.e., D), excluding any commits reachable from the tip of mainline. The first commit that's not copied is commit *, where the two branches rejoin (in this case forever).
In some cases, you may need to use git rebase --onto to separate the two notions. With --onto, you tell rebase where to put the copies, freeing up the remaining argument to mean what not to copy. That's not required here.
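The selection rule can be previewed with the range syntax mainline..topic, which lists commits reachable from topic but not from mainline. The sketch below builds the A-B-C-D picture in a scratch repository (branch names as in the diagrams; the file names and the commit helper are made up for illustration) and then runs the rebase in its long form, where --onto names the target and the next argument names what not to copy.

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/mainline   # name the initial branch "mainline"
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: one new file per commit, so nothing ever conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base                                 # plays the role of commit *
git checkout -q -b topic
for c in A B C D; do commit "$c"; done
git checkout -q mainline
commit m1; commit m2

# What a rebase of topic onto mainline would copy, oldest first.
to_copy=$(git log --reverse --format=%s mainline..topic | xargs)
echo "to copy: $to_copy"

# Long form of "git checkout topic && git rebase mainline".
git rebase -q --onto mainline mainline topic
after=$(git log --reverse --format=%s mainline..topic | xargs)
echo "after rebase: $after"
```

After the rebase the same four subjects are listed, but they now sit directly on top of the tip of mainline.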
There are a bunch of kinds/flavors of rebase: git rebase with no options historically used git format-patch | git am to copy commits, rather than actually running git cherry-pick, while git rebase -i actually uses git cherry-pick. (Since Git 2.26, plain git rebase also defaults to the cherry-pick-style machinery.) (In older versions of Git, git rebase -i was a shell script that literally ran git cherry-pick. To make it faster on Windows, git rebase was modified so that -i is built in to Git's sequencer, which is the code that implements both cherry-pick and revert.)
Note that all this copying, which goes one at a time, ends up building a linear chain of commits. This happens even if the inputs might include a merge, as in:
          A--B--M--C--D   <-- master
         /     /
...--o--*--o--S------o--T   <-- repo1/master
You now ask Git to rebase (i.e., copy) some commits (in this case, some commits that are on master), with the --onto target being T, and the limit being the first commit reachable from T (repo1/master) that is also on master, which is commit *.
The complete list of such commits is A, then B, then M, then C, then D. But how should Git copy M? If it tried, the result might look a lot like:
          A--B--M--C--D   [abandoned]
         /     /
...--o--*--o--S------o--T   <-- repo1/master
                         \
                          A'-B'-M'-C'-D'   <-- master (HEAD)
                               /
                  ???----------
except M', to be a merge, needs to have two parents. What other parent should it have? If its other parent is S, well, that's possible, but what value does it bring?
(The point of a merge is to combine changes in two different lines of development. Since A' is based on T, which is based on S, A' already includes whatever was in S, and there is no need to merge it.)
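The effect on the question's history can be reproduced in a scratch repository. In this sketch a local branch named upstream stands in for repo1/master (an assumption made so that no second repository is needed), and the commit helper and file names are made up for illustration. It counts merge commits in upstream..master before and after the rebase.

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: one new file per commit, so nothing ever conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit hello1; commit hello2
git branch upstream                       # stand-in for repo1/master
commit bye; commit bye-bye
git checkout -q upstream; commit hello3; commit hello4
git checkout -q master
git merge -q --no-edit upstream           # the merge commit (b94aa29 in the question)
commit good-bye; commit see-ya
git checkout -q upstream; commit hello5; commit hello6
git checkout -q master

merges_before=$(git rev-list --merges --count upstream..master)
git rebase -q upstream
merges_after=$(git rev-list --merges --count upstream..master)
copied=$(git log --reverse --format=%s upstream..master | xargs)

echo "merge commits before: $merges_before, after: $merges_after"
echo "copied: $copied"
```

Only the four ordinary commits are copied; the merge is left out of the new chain because its content is already present in the commit the copies are built on.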
In general, Git simply omits the merge commits entirely here, so it ends up copying just A-B-C-D. Note that if you rebase something containing an internal merge, the same thing happens: Git simply copies both "sides" of the merge, linearizing the result:
                 C--D
                /    \
            A--B      M--G   <-- topic (HEAD)
           /    \    /
          /      E--F
         /
...--o--*--o--o   <-- mainline
Here git rebase will copy A-B-C-D-E-F-G or perhaps A-B-E-F-C-D-G, removing M and flattening the topology.
There is a -p flag to git rebase -i, which has a longer spelling --preserve-merges, but it doesn't actually preserve the merges (nor cherry-pick them, which is impossible). Instead, it makes new merges (by running git merge). (Newer versions of Git deprecate -p in favor of --rebase-merges, which likewise re-creates the merges.) This is quite tricky, but can be used to rebase the above A-B-(C-D, E-F)-M-G topology. Note that if you resolved merge conflicts in M, you will have to resolve them again when Git makes a new merge M' that merges D' and F' (git rerere may be useful here).
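The difference between flattening and re-creating an internal merge can be compared side by side. Recent Git versions no longer ship -p, so this sketch assumes a Git new enough (2.18 or later) to have its successor, --rebase-merges. The branch names side and topic-flat, the file names, and the commit helper are made up for illustration.

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/mainline
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: one new file per commit, so nothing ever conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b topic
commit A; commit B
git checkout -q -b side                   # E--F forks off at B
commit E; commit F
git checkout -q topic
commit C; commit D
git merge -q --no-edit side               # internal merge M
commit G
git branch topic-flat                     # second name for the same history
git checkout -q mainline; commit m1

# Default rebase: M is dropped and both sides are linearized.
git checkout -q topic-flat
git rebase -q mainline
flat_merges=$(git rev-list --merges --count mainline..topic-flat)
flat_total=$(git rev-list --count mainline..topic-flat)

# --rebase-merges: a new merge M' is made from the copied sides.
git checkout -q topic
git rebase -q --rebase-merges mainline
kept_merges=$(git rev-list --merges --count mainline..topic)
kept_total=$(git rev-list --count mainline..topic)

echo "flattened: $flat_total commits, $flat_merges merges"
echo "rebase-merges: $kept_total commits, $kept_merges merges"
```

In both cases every commit gets a new hash; the only difference is whether a fresh merge is created where M used to be.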
[1] Permanent, that is, until the entire commit has been abandoned long enough for Git to be sure that no one wants it; then it gets cleaned away by git gc.
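Until that cleanup happens, the abandoned merge commit is still in the repository and can be found again, for example through ORIG_HEAD or the branch's reflog. A minimal sketch, again with a local branch named upstream standing in for the remote-tracking branch and made-up file and branch names:

```shell
set -e
cd "$(mktemp -d)"
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: one new file per commit, so nothing ever conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git branch upstream                       # stand-in for repo1/master
commit local1
git checkout -q upstream; commit remote1
git checkout -q master
git merge -q --no-edit upstream
merge_id=$(git rev-parse HEAD)            # remember the merge commit's ID

git checkout -q upstream; commit remote2
git checkout -q master
git rebase -q upstream                    # the merge is no longer on master

obj_type=$(git cat-file -t "$merge_id")   # the object itself still exists
orig_head=$(git rev-parse ORIG_HEAD)      # rebase records the old tip here
reflog_prev=$(git rev-parse 'master@{1}') # previous value of master
on_branches=$(git branch --contains "$merge_id")

echo "type: $obj_type"
echo "branches containing it: ${on_branches:-none}"

# Giving it a name again keeps it safe from git gc.
git branch recovered "$merge_id"
```

The merge is on no branch after the rebase, yet both ORIG_HEAD and master@{1} still resolve to it, so it can be inspected or given a new branch name.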