Is there any difference in the number of conflicts when merging a branch as opposed to rebasing it? Why is that?
When doing a merge, the merged changes are stored in the merge commit itself (the commit with two parents). But when doing a rebase, where is the merge stored?
Thanks, Omer
A rebase is (mostly) just a series of cherry-picks. Both a cherry-pick and a merge use the same logic — what I call "merge logic", and what the docs usually call a "3-way merge" — to create a new commit.
That logic is, given commits X and Y:
1. Start with an earlier commit. This is called the merge base.
2. Make a diff between the earlier commit and X.
3. Make a diff between the earlier commit and Y.
4. Apply both diffs to the earlier commit, and:
   a. If you can do that, make a new commit expressing the result.
   b. If you can't do it, complain that you've got a conflict.
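To make those steps concrete, here is a minimal Python sketch of the merge logic. It is a simplification, not what Git actually runs: it treats each commit as a dict mapping a path to whole-file content, whereas Git also merges at the level of lines within a file. The names `three_way_merge` and `MergeConflict` are made up for this illustration.

```python
class MergeConflict(Exception):
    """Raised when X and Y both change the same path in different ways."""


def three_way_merge(base, x, y):
    """Merge snapshots x and y relative to the earlier commit `base`.

    Each snapshot is a dict of path -> content. A missing key means the
    path does not exist in that snapshot (hypothetical toy model).
    """
    merged = {}
    for path in sorted(set(base) | set(x) | set(y)):
        in_base, in_x, in_y = base.get(path), x.get(path), y.get(path)
        if in_x == in_y:
            result = in_x   # both sides agree, or neither touched it
        elif in_x == in_base:
            result = in_y   # only Y changed this path
        elif in_y == in_base:
            result = in_x   # only X changed this path
        else:
            raise MergeConflict(path)   # both changed it, differently
        if result is not None:          # None means the path was deleted
            merged[path] = result
    return merged


base = {"a.txt": "1", "b.txt": "1"}
x = {"a.txt": "2", "b.txt": "1"}    # X edits a.txt
y = {"a.txt": "1", "b.txt": "3"}    # Y edits b.txt
clean = three_way_merge(base, x, y)  # {"a.txt": "2", "b.txt": "3"}
```

If X and Y had both edited `a.txt` to different content, the same call would raise `MergeConflict` instead of returning a result. Notice that the outcome depends on all three inputs, including `base`.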
In this respect, merge and cherry-pick (and therefore merge and rebase) are almost the same thing, but there are some differences. One extremely important difference is who the "3" are in the logic of the "3-way merge". Specifically, merge and cherry-pick can have different ideas about who the "earlier commit" is in the first step (the merge base).
Let's take first a degenerate example where merge and cherry-pick are almost identical:
    A -- B -- C  <-- master
          \
           F  <-- feature
If you merge feature into master, Git looks for the commit where feature and master most recently diverged. That is B. It is the "earlier commit" in our merge logic, that is, the merge base. So Git diffs C with B, and diffs F with B, and applies both diffs to B to form a new commit. It gives that commit two parents, C and F, and moves the master pointer:
    A -- B - C - Z  <-- master
          \     /
           \   /
             F  <-- feature
If you cherry-pick feature onto master, Git looks for the parent of feature, meaning the parent of F. That is B again! (That's because I deliberately chose this degenerate case.) That is the "earlier commit" in our merge logic. So once again Git diffs C with B, and diffs F with B, and applies both diffs to B to form a new commit. Now it gives that commit one parent, C, and moves the master pointer:
    A -- B - C - F'  <-- master
          \
           F  <-- feature
If you rebase feature onto master, Git does a cherry-pick of each commit on feature and moves the feature pointer. In our degenerate case there is just one commit on feature:
    A -- B - C  <-- master
          \   \
           \   F'  <-- feature
            F
Now, in those diagrams, it happens that the "earlier commit" that serves as the merge base is the same in every case: B. So the merge logic is the same, so the possibility of a conflict is the same, in every diagram.
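Here is a small Python sketch of how the two operations pick their "earlier commit" in this degenerate history. It is only a toy model: the `PARENTS` table and both function names are invented for the illustration, and it assumes a history with exactly one best common ancestor (real histories with criss-cross merges can have several).

```python
# Toy history from the diagrams: each commit maps to its list of parents.
PARENTS = {"A": [], "B": ["A"], "C": ["B"], "F": ["B"]}


def ancestors(commit, parents):
    """Return the commit itself plus everything reachable via parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def merge_base_for_merge(ours, theirs, parents):
    """Most recent common ancestor of the two branch tips."""
    common = ancestors(ours, parents) & ancestors(theirs, parents)
    # Keep only common ancestors that no other common ancestor descends from.
    best = [c for c in common
            if not any(c != d and c in ancestors(d, parents) for d in common)]
    if len(best) != 1:
        raise ValueError("toy model assumes exactly one best common ancestor")
    return best[0]


def merge_base_for_cherry_pick(picked, parents):
    """A cherry-pick uses the parent of the picked commit as the base."""
    return parents[picked][0]


merge_base = merge_base_for_merge("C", "F", PARENTS)      # "B"
cherry_pick_base = merge_base_for_cherry_pick("F", PARENTS)  # "B"
```

Both calls land on B here, which is exactly why the degenerate case behaves the same for merge, cherry-pick, and rebase.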
But if I introduce more commits on feature, things change:
    A -- B -- C  <-- master
          \
           F -- G  <-- feature
Now, to rebase feature onto master means to cherry-pick F onto C (giving F') and then to cherry-pick G onto that (giving G'). For that second cherry-pick, Git uses F as the "earlier commit" (the merge base), because it is the parent of G. This introduces a situation we have not considered before. In particular, the merge logic is going to involve a diff from F to F', along with a diff from F to G.
So when we rebase, we iteratively cherry-pick each commit along the rebased branch, and on each iteration the three commits being compared in our merge logic are different. So clearly we introduce new possibilities for a merge conflict, because, in effect, we are doing many more distinct merges.
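To see how the three commits change on each iteration, here is a hedged Python sketch that lists the (merge base, current tip, picked commit) triple for every cherry-pick in a rebase. The `rebase_plan` function and the `PARENTS` table are hypothetical, and the model assumes a linear branch with single-parent commits.

```python
# Toy history: A -- B -- C on master, and F -- G branching off B on feature.
PARENTS = {"A": [], "B": ["A"], "C": ["B"], "F": ["B"], "G": ["F"]}


def rebase_plan(branch_commits, onto, parents):
    """List the 3-way merge inputs for each cherry-pick of a rebase.

    Each entry is (merge base, current tip, commit being picked).
    """
    plan = []
    tip = onto
    for commit in branch_commits:
        base = parents[commit][0]    # a cherry-pick's base is the commit's parent
        plan.append((base, tip, commit))
        tip = commit + "'"           # the new copy becomes the next tip
    return plan


plan = rebase_plan(["F", "G"], "C", PARENTS)
# [("B", "C", "F"), ("F", "F'", "G")]
```

Merging feature into master, by contrast, is a single 3-way merge with the triple (B, C, G). The rebase runs two separate merges with two different bases, and each one is its own opportunity for a conflict.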