Tags: git, merge, git-merge

What happens in git when you merge unrelated histories?


I am dealing with unrelated histories and I know I can force git to merge them by using --allow-unrelated-histories. However, this option is disabled by default. It must be for a reason, and I am having a hard time finding any documentation about what happens when you use that option and why it is disabled by default.

What does this option do exactly and why would I not want to use it?


For example, say I create a new project and the following happens:

I generate two commits and push them to the remote

master        : a -> b
origin/master : a -> b

Then, while I work on more commits, the commits on the remote are modified or replaced (I know this sounds horrible, but it is a very valid workflow when using Gerrit and applying changes to patchsets)

master        : a -> b -> c -> d -> e
origin/master : a'-> b'

What happens if I do git pull --allow-unrelated-histories? What does the tree look like? Does my local branch stack the commits or is there conflict resolution or something else?


Solution

  • Normally, when you do a merge, Git considers three (and only three) points: the two heads and the merge base, which is usually the most recent common ancestor. Those three points are required for a three-way merge, which Git performs by computing each side's changes relative to the merge base and applying both sets of changes to a single result tree.

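    However, when you merge unrelated histories, there is no common ancestor, so Git uses the empty tree as the merge base. Every file then looks newly added on both sides, and any path that exists in both histories with different contents produces an add/add conflict, so you can end up with a lot of conflicts and general sadness. Merging unrelated histories is usually unintentional and a mistake, which is why Git refuses by default and requires --allow-unrelated-histories when you really do want it (which you do here).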

    The best way to do a merge in this case is to merge two identical trees, or at least two trees that differ from each other as little as possible. If a file is identical in the two heads (that is, its blobs have the same object ID), Git will trivially merge it by taking that version, so there are by definition no merge conflicts for identical files or identical trees.

    You're likely to run into the same problem with a rebase: whether rebase applies patches or performs merges under the hood, it hits the same conflicts, so a merge is probably your best choice here.

    Your branch, when you're done merging, will still have two roots, so you'll end up with something like this (where f is your merge commit):

    a  - b  - c - d -  e - f
    a' - b' -------------/
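The behaviour described above can be tried safely in a throwaway repository. The following is a minimal sketch, not part of the original answer: the branch names `local` and `rewritten`, the file `file.txt`, and the shell variables are all made up for illustration, and `git commit-tree` is used only as a shortcut to fabricate a second root commit whose tree is identical to the local one. It assumes a Git version recent enough to refuse unrelated merges by default.

```shell
#!/bin/sh
# Sketch: merging unrelated histories in a throwaway repository.
# Branch, file, and variable names are illustrative assumptions.
set -eu

cd "$(mktemp -d)"
git init -q demo
cd demo
git config user.name "Demo"
git config user.email "demo@example.com"
git symbolic-ref HEAD refs/heads/local   # name the first (still unborn) branch

# First root: a -> b
echo "one" > file.txt
git add file.txt
git commit -q -m "a"
echo "two" >> file.txt
git commit -q -am "b"

# Second, unrelated root: a' adds the same path with different contents.
git checkout -q --orphan rewritten
git rm -q -rf .
echo "uno" > file.txt
git add file.txt
git commit -q -m "a'"
git checkout -q local

# 1. Without the option, Git refuses to merge.
if git merge --no-edit rewritten >/dev/null 2>&1; then
    plain=merged
else
    plain=refused
fi

# 2. With the option, the merge base is empty, so the shared path conflicts.
if git merge --allow-unrelated-histories --no-edit rewritten >/dev/null 2>&1; then
    forced=clean
else
    forced=conflict
fi
conflict_status=$(git status --porcelain)
git merge --abort

# 3. A parentless commit whose tree is identical to ours merges cleanly.
same_tree=$(git commit-tree -m "b'" "local^{tree}")
git merge -q --allow-unrelated-histories --no-edit "$same_tree"

# The merge commit has two parents, and the branch now has two root commits.
roots=$(git rev-list --count --max-parents=0 HEAD)
parents=$(git cat-file -p HEAD | grep -c '^parent ')

echo "plain merge:        $plain"
echo "forced merge:       $forced ($conflict_status)"
echo "root commits:       $roots"
echo "merge parents:      $parents"
git --no-pager log --graph --oneline
```

In the second step `git status --porcelain` should report the path with the `AA` (both added) code, which is the add/add conflict mentioned above. The final graph has the same shape as the diagram: two independent lines of history joined by a single merge commit.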