How to make a git branch the descendant of another without changing its content?


I have two branches, 2 and 4:

2c              <- top of branch 2
|
2b
|
2a              <- root branch 2

and

4c             <- top of branch 4
|
4b
|
4a              <- root branch 4

The problem is:

The root of branch 4 was actually derived from the top of branch 2, but this was done outside of git.

I would like to reflect that in git by chaining these two branches, so that we can easily identify the differences between 2c and 4a, and also so that merging a third branch with similar changes would be easier for the merge algorithms.

The result should look like this:

4c             <- top of branch 4
|
4b
|
4a              <- root branch 4
| <================================== new connection between the two branches
2c              <- top of branch 2
|
2b
|
2a              <- root branch 2

After much reading about rebasing and changing a commit's parent, I used the following approach, which doesn't seem optimal:

    cd dev/project      # work locally, within the repo top folder: project
    git checkout 2
    git checkout -b 4L  # temporary branch to perform the link between the branches, based on the top of branch 2
    git checkout 4
    git checkout 4a     # retrieve the root of branch 4 in the working directory before copying it
    cp -r . ../project-copy-4a  # save the working directory for later use
    git checkout 4L     # return to the temporary branch (it already exists, so no -b)
    rm -rf *            # clean the working directory
    cp -r ../project-copy-4a/* ./  # bring in the 4a content to simulate the developer's changes from state 2c
    git add -A .        # stage new, modified, and deleted files before the commit
    git commit -m "connect branches 2 and 4"

I now have this state:

4La     (content equal to 4a)
|
2c              <- top of branch 2
|
2b
|
2a              <- root branch 2

To connect the branches, I now rebase branch 4 onto branch 4L, whose top content is equal to the root of branch 4, so the content of the branch 4 commits does not change:

    git checkout 4  # move to branch 4
    git rebase 4L   # rebase it on branch 2, through branch 4L

I now have the target state (where 4x' may have a new commit hash from the rebase, but its content is equal to that of 4x):

4c'             <- top of branch 4
|
4b'
|
(4a')              <- root branch 4 (not sure git keeps this commit as it contains no changes)
|
4La
| <================================== new connection between the two branches
2c              <- top of branch 2
|
2b
|
2a              <- root branch 2

What would be a better procedure to achieve this with pure git commands, without resorting to file system commands like cp -r, which can be heavy and risky, for example because of differences between the file system tree and the git file tree?


Solution

  • I think you want to graft your history to have Git pretend that the parent of 4a is 2c:

    git replace --graft 4a 2c
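
To see what the graft does, here is a minimal sketch in a throwaway repository. The branch names `b2` and `b4`, the file `file.txt`, and the identity settings are assumptions made up for the demo; they stand in for the branches 2 and 4 of the question.

```shell
set -e
repo=$(mktemp -d)                       # throwaway repository for the demo
cd "$repo"
git init -q .
git config user.name  "Example"        # hypothetical identity, demo only
git config user.email "example@example.com"

# Branch b2: three commits standing in for 2a, 2b, 2c.
git symbolic-ref HEAD refs/heads/b2
for n in a b c; do
    echo "2$n" > file.txt
    git add file.txt
    git commit -q -m "2$n"
done

# Branch b4: an unrelated history standing in for 4a, 4b, 4c.
git checkout -q --orphan b4
for n in a b c; do
    echo "4$n" > file.txt
    git add file.txt
    git commit -q -m "4$n"
done

root4=$(git rev-list --max-parents=0 b4)   # the parentless commit "4a"
git replace --graft "$root4" b2            # pretend the parent of 4a is 2c

git log --oneline b4                       # history of b4 now continues into b2
git diff --stat b2 "$root4"                # differences between 2c and 4a
```

The original commit objects are untouched at this point; the graft lives in a ref under `refs/replace/` and can be removed again with `git replace -d`.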
    

    Then use git filter-branch or git filter-repo to make your graft(s) permanent. This will rewrite your commits and change their commit hashes; there's no way of avoiding that.

    This answer explains how to persist the grafted history. Replace the --all parameter with the commit range that you want to rewrite (e.g. 4a'..4c').
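
As a rough sketch of that step with `git filter-branch` (not necessarily what the linked answer does), the following rebuilds the same kind of throwaway repository, grafts it, and then rewrites only the commits of the second branch. The names `b2`, `b4`, `file.txt`, and the helper `mk` are assumptions for the demo.

```shell
set -e
repo2=$(mktemp -d)                      # throwaway repository for the demo
cd "$repo2"
git init -q .
git config user.name  "Example"        # hypothetical identity, demo only
git config user.email "example@example.com"

# Hypothetical helper: commit file.txt with the given content and message.
mk() { echo "$1" > file.txt; git add file.txt; git commit -q -m "$1"; }

git symbolic-ref HEAD refs/heads/b2; mk 2a; mk 2b; mk 2c
git checkout -q --orphan b4;         mk 4a; mk 4b; mk 4c

root4=$(git rev-list --max-parents=0 b4)
git replace --graft "$root4" b2        # temporary graft: parent of 4a is 2c

# Rewrite the commits that are on b4 but not on b2; filter-branch honors
# replacement refs, so the grafted parent is written into real commits.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -- b2..b4

git replace -d "$root4"                # the replacement ref is no longer needed
git log --oneline b4
```

`git filter-branch` keeps a backup of the old branch tip under `refs/original/`, which can be deleted once the result has been checked.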


    The alternative is to create your commit objects manually from the existing trees:

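    git checkout 2c                                               # start from the top of branch 2
    id=$(git commit-tree -m 'commit message' -p 2c 4a^{tree})     # new commit: tree of 4a, parent 2c
    id=$(git commit-tree -m 'commit message' -p "$id" 4b^{tree})  # new commit: tree of 4b, parent is the previous one
    …
    git merge --ff-only "$id"                                     # fast-forward to the last new commit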
    

    The resulting history of both approaches is the same (but the commit-tree approach will use different author and committer dates, unless you manually take care of that).
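
One possible way to "manually take care of that" is to loop over the commits and pass the original author and committer data to `git commit-tree` through the standard `GIT_AUTHOR_*` and `GIT_COMMITTER_*` environment variables. This is only a sketch in a throwaway repository; the names `b2`, `b4`, `b4-linked`, `file.txt`, and the helper `mk` are assumptions for the demo.

```shell
set -e
repo3=$(mktemp -d)                      # throwaway repository for the demo
cd "$repo3"
git init -q .
git config user.name  "Example"        # hypothetical identity, demo only
git config user.email "example@example.com"

# Hypothetical helper: commit file.txt with the given content and message.
mk() { echo "$1" > file.txt; git add file.txt; git commit -q -m "$1"; }

git symbolic-ref HEAD refs/heads/b2; mk 2a; mk 2b; mk 2c
git checkout -q --orphan b4;         mk 4a; mk 4b; mk 4c

parent=$(git rev-parse b2)              # the first new commit gets 2c as parent
for c in $(git rev-list --reverse b4); do   # oldest first: 4a, 4b, 4c
    parent=$(
        GIT_AUTHOR_NAME=$(git log -1 --format=%an "$c") \
        GIT_AUTHOR_EMAIL=$(git log -1 --format=%ae "$c") \
        GIT_AUTHOR_DATE=$(git log -1 --format=%aD "$c") \
        GIT_COMMITTER_NAME=$(git log -1 --format=%cn "$c") \
        GIT_COMMITTER_EMAIL=$(git log -1 --format=%ce "$c") \
        GIT_COMMITTER_DATE=$(git log -1 --format=%cD "$c") \
        git commit-tree -p "$parent" \
            -m "$(git log -1 --format=%B "$c")" "$c^{tree}"
    )
done

git branch -f b4-linked "$parent"       # new branch name is an assumption
git log --oneline b4-linked
```

The sketch writes the result to a separate branch so the original `b4` stays available for comparison, for example with `git diff b4 b4-linked`, which should print nothing.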