
Merge two Git repos after moving from GitLab to GitHub


I'm facing a challenge with merging Git histories from two repositories and would appreciate some guidance. Here's the situation:

We moved our codebase from GitLab to GitHub about 5 months ago, but unfortunately, the Git history was not transferred along with the repository. The GitLab repository's history ends on July 1, 2023, and the GitHub repository picks up from the same date without the prior history. Since the move, significant new commits have been made to the GitHub repository. I attempted to merge the histories to create a seamless history in the GitHub repository but faced several issues.

My Goal: I want to integrate the old history from the GitLab repository into the GitHub repository so that the entire history is accessible and browsable in one place, as if it had always been a single repository.

What I Tried:

Rebasing: I tried rebasing the new GitHub history onto the old GitLab history. However, this proved to be extremely cumbersome due to a large number of conflicts and the complexity of the history.

Cherry-picking: I also attempted to cherry-pick the commits, but ran into "bad revision" errors and had trouble handling merge commits.


Solution

  • If you don't mind rewriting all of the "new" (on GitHub) history, you can use the replace mechanism:

    git replace --graft "$first_commit_on_github" "$last_commit_on_gitlab"
    

    and then run filter-branch to make the graft permanent (this step rewrites the history, so the hashes of the GitHub-side commits change):

    git filter-branch --tag-name-filter cat -- ^"$replaced_commit" --all
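
Before making the graft permanent, it can help to preview it, since a replace ref only changes how Git reads the history and can be removed again. The sketch below does this in a throwaway repository. The branch names `gitlab` and `github`, the file name, and the commit messages are placeholders chosen for illustration; in a real setup the old history would first have to be fetched from the GitLab remote, and the lookup of the first commit assumes the GitHub history has a single root commit.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is a hypothetical stand-in.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Old history A-B-C on a branch called "gitlab".
git symbolic-ref HEAD refs/heads/gitlab
for msg in A B C; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

# New, unrelated history D-E-F on a branch called "github".
# D has the same content as C but no parent commit.
git checkout -q --orphan github
git commit -q -m D
for msg in E F; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

# Root commit of the new history and tip of the old one.
first_commit_on_github=$(git rev-list --max-parents=0 github)
last_commit_on_gitlab=$(git rev-parse gitlab)

# Pretend that D has C as its parent.
git replace --graft "$first_commit_on_github" "$last_commit_on_gitlab"

# With the replace ref in effect, the log walks through both histories.
git log --oneline github

# Commit counts with the graft applied and with replace refs ignored.
git rev-list --count github
git --no-replace-objects rev-list --count github
```

If the preview looks wrong, `git replace -d "$first_commit_on_github"` removes the graft again without touching any commit.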
    

    If you want to avoid rewriting the history, there is a workaround. This is your current graph, with two unrelated histories:

    A-B-C           < gitlab
            D-E-F   < github
    

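    First, introduce a merge that uses the "ours" strategy to connect the histories. Because the two branches share no common ancestor, Git 2.9 and later also need --allow-unrelated-histories:

    git checkout -b combining D
    git merge -s ours --allow-unrelated-histories C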
    

    This creates a new merge commit D' with parents D and C and exactly the same tree as D, because the "ours" strategy ignores the content of the merged-in side. (From your question it sounds like C and D should be identical anyway.)

    A-B-C              < gitlab
         \
          D'           < combining
         /
        D-E-F          < github
    

    Then merge the rest of the GitHub history:

    git merge github
    

    Creating the following history:

    A-B-C             < gitlab
         \
          D'--M       < combining
         /   /
        D-E-F         < github
    

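    Finally, either rename the combining branch to whatever name makes sense for you, or fast-forward github to combining. At that point github is a subset of combining, so it can be fast-forwarded without problems.

    The approach is non-destructive: no existing commit is rewritten, so all hashes stay valid. It should also avoid merge conflicts, because D' has the same tree as D and the second merge therefore only brings in the changes made after D.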
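
The non-destructive steps above can be tried end to end in a scratch repository before touching the real one. This is only a sketch under assumptions: the branch names `gitlab`, `github` and `combining` follow the graphs above, while the file name, commit messages and merge messages are made up for the demo. It also assumes the GitHub history has a single root commit, which stands in for D.

```shell
#!/bin/sh
set -eu

# Scratch repository; every name below is a hypothetical stand-in.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Old history A-B-C on branch "gitlab".
git symbolic-ref HEAD refs/heads/gitlab
for msg in A B C; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

# Unrelated new history D-E-F on branch "github"; D has the same content as C.
git checkout -q --orphan github
git commit -q -m D
for msg in E F; do
    echo "$msg" > file.txt
    git add file.txt
    git commit -q -m "$msg"
done

# D is the root commit of the new history.
first_commit_on_github=$(git rev-list --max-parents=0 github)

# Step 1: create D' by joining C onto D while keeping D's tree.
git checkout -q -b combining "$first_commit_on_github"
git merge -q -s ours --allow-unrelated-histories -m "Connect GitLab history" gitlab

# Step 2: merge the rest of the new history, creating M.
git merge -q -m "Merge GitHub history" github

# Sanity checks; with "set -e" the script stops if any of them fails.
git diff --quiet combining github                # M has the same content as F
git merge-base --is-ancestor gitlab combining    # old history is reachable
git merge-base --is-ancestor github combining    # new history is reachable

# Step 3: fast-forward github to the combined history.
git checkout -q github
git merge -q --ff-only combining

git log --oneline --graph github
```

None of the original commits change in this flow, so anyone who already has the `github` branch can simply pull the two extra merge commits.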