
Git rebase with forked remote


There is a main Git repository, main-repo, with a master branch; I call this remote my-remote.
I forked it to my profile (main-repo, master branch); this is origin.

A commit was added to my-remote after it was forked.
I added a commit to origin after forking, so origin is now one commit ahead and one commit behind.

Goal: rebase so that origin has the commit from my-remote first and then the commit from origin. I understand that this will create new commits (new SHAs).

I am cloning from my profile.
git clone url2 (this is origin)

Added remote repo as another remote.
git remote add my-remote gitRepoToRemote

I am on my local master branch which is tracking origin.
Trying to rebase.
git fetch --all
git rebase my-remote/master

Got conflicts.
Resolved manually.
then git add file-with-conflict.
continue with rebase git rebase --continue.
git push origin HEAD:master
The push fails with the error below. Why am I getting this error when I resolved the conflict before pushing?

 ! [rejected]        master -> master (non-fast-forward)
error: failed to push some refs to 'gitRepoToOrigin'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Integrate the remote changes (e.g.
hint: 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details

Solution

  • As implied (but not stated) in comments, you will need some kind of forced push. The reason is not that complicated, but it helps a lot to draw the reason, rather than talk about it. The problem here, to put it in a funny way, is that you have three different repositories all named Bruce, one new commit named Bruce, and one rewrite of that new commit named Bruce. You now need to tell Bruce to forget about Bruce because Bruce has a new Bruce and Bruce needs to match up with Bruce.

    (The above is a bit of overkill, but see also the had had had had had sentence.)

    Instead of calling everyone and everything "Bruce", or trying to give the exact hash ID of each commit—which causes similar (though distinct) problems because humans are bad at hash IDs, converting each one to a "Bruce"-like bleep in their heads—let's draw each repository as a series of commits with the commits designated by single uppercase letters. We start with what you called the "main-repo" at first, then called my-remote later when you used git remote add:

    my-remote:
    
    ...--F--G--H   <-- master
    

    The name master exists in this repository. That name stores the hash ID of commit H (some big ugly hash ID). Commit H, at the right, is the newest. It contains:

    • a full snapshot of every file, and
    • metadata that include the hash ID of earlier commit G

    so commit H links backwards to earlier commit G. Earlier commit G has a snapshot of every file (from earlier), of course, plus the hash ID of still-earlier commit F. This repeats all the way back to the beginning of time: this is how Git stores commits.
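
The backward link can be inspected directly. The following is a minimal sketch in a throwaway repository; the directory name, the demo identity, and the file contents are placeholders, not part of the original question.

```shell
set -e
# Placeholder identity so commits work without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
# Force the branch name to master regardless of the local default.
git symbolic-ref HEAD refs/heads/master

echo one > file.txt
git add file.txt
git commit -q -m "G"
echo two > file.txt
git commit -q -am "H"

# Print the raw commit object that master points to: a tree line (the
# snapshot), a parent line (the earlier commit's hash ID), then the
# author, committer, and message.
git cat-file -p master

# The parent recorded inside H should be the hash ID of G.
parent_in_h=$(git cat-file -p master | sed -n 's/^parent //p')
hash_of_g=$(git rev-parse master~1)
echo "parent stored in H: $parent_in_h"
echo "hash ID of G:       $hash_of_g"
```

The two printed hash IDs should match, which is all that "H links backwards to G" means.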

    You then used GitHub's fork button to make a copy of this repository, which we will call origin. This copy has its own branch names but shares the actual commits with the original repository.1 In any case, in this copy over on GitHub, the name master holds the same hash ID:

    origin:
    
    ...--F--G--H   <-- master
    

    You then cloned the GitHub fork to your own computer (we'll call this "laptop", even if it's a desktop or deskside computer, just to give it a memorable but distinct name). This copied the commits, but the copies are identical to the originals—they have to be; no part of any commit can ever be changed after it's made—and then copied their branch names to remote-tracking names, so that you had:

    laptop:
    
    ...--F--G--H   <-- origin/master
    

    Your git clone command then created a new master branch in your laptop repository, based on your -b master argument, or your lack of any -b argument plus the recommendation from GitHub that your Git use the name master here:

    laptop:
    
    ...--F--G--H   <-- master, origin/master
    

    and finally, on your laptop, your Git ran git checkout master to populate a working tree (and Git's index though we won't cover this here) from commit H, making master the current branch and commit H the current commit, on laptop:

    laptop:
    
    ...--F--G--H   <-- master (HEAD), origin/master
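
What git clone leaves behind can be checked with a local stand-in for the fork. This is a sketch under assumptions: origin.git is a local bare repository playing the role of the GitHub fork, and seed is a hypothetical helper repository used only to put commit H into it.

```shell
set -e
# Placeholder identity so commits work without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A bare repository standing in for the fork on GitHub.
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

# Helper repository that creates commit H and pushes it to the stand-in.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo base > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "H"
git -C seed push -q "$work/origin.git" master

# The clone copies the commits, records origin/master, creates a local
# master at the same commit, and checks it out.
git clone -q origin.git laptop
cd laptop
branch=$(git symbolic-ref --short HEAD)
local_tip=$(git rev-parse master)
tracking_tip=$(git rev-parse origin/master)
git branch -a -v
```

Both master and origin/master should report the same hash ID, matching the drawing above.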
    

    1Whether they share the raw underlying commits or copy them isn't really relevant to us. The fact is that the commits are 100%, bit-for-bit identical. This means that as long as both my-remote and origin are stored on the same physical machine, they can share the underlying commits. If they're stored on separate physical machines with separate permanent memory systems (spinning disks, flash, whatever), they'll need to use copies of the raw commit bytes, rather than literally sharing the underlying storage, but we don't care about that.


    Summary so far

    As of this time (some time in the past), the contents of all three repositories look very similar. The chief (and observable) difference in your laptop repository is that you have origin/master, and a working copy of commit H. Other than these differences (important for getting work done, but otherwise irrelevant), all three look like this:

    ...--F--G--H   <-- master
    

    Things change over time

    Next, though, you made a new commit on your laptop. We'll call this commit I, the next letter after H. So on laptop we now have:

    laptop:
    
    ...--F--G--H   <-- origin/master
                \
                 I   <-- master (HEAD)
    

    Meanwhile, someone sent a new commit to master over on GitHub for the my-remote repository. Normally I would call this commit J but for no apparent reason, I will call it K here. So they have this, which I will draw oddly for no apparent reason either:

    my-remote:
                 K   <-- master
                /
    ...--F--G--H
    

    Note how the repository named origin has not changed at all yet.

    At this point, you run git push on your laptop. Without worrying too much about exactly which arguments you pass to git push (if any) here, this winds up sending the one commit that your laptop has, that origin does not, to origin. So now, on origin, you have:

    origin:
    
    ...--F--G--H
                \
                 I   <-- master
    

    When this push succeeds, your laptop updates its origin/master. That's because your git push had your laptop call GitHub, over the internet, asking GitHub to connect to repository origin (which worked). Your laptop Git sent the data that make up commit I, then asked GitHub to set their master, and they said OK, so your laptop Git knows that the origin Git has its master pointing to commit I. So now on laptop you have:

    laptop:
    
    ...--F--G--H
                \
                 I   <-- master (HEAD), origin/master
    

    We don't need all these little kinks in the graphs we're drawing here, but I'm using them on purpose. Note that neither origin nor laptop know anything about commit K yet! Similarly, my-remote has never heard of commit I.

    The fun (?) begins

    At this point you start doing something fancy:

    Added remote repo as another remote. git remote add my-remote gitRepoToRemote

    Your laptop Git now knows that another Git, named my-remote, exists. It has not yet connected to it.

    git fetch --all
    

    Most people misuse --all, but here you have used it correctly (congratulations! 😀). The --all option directs your Git to run git fetch against each remote.

    This was a bit of overkill, because this had your Git call up both origin and my-remote, to see what was new, if anything, in those two repositories. Nothing was new in origin, and nothing could be, because you would be the only one updating it. But that's OK! When your Git called up the Git called my-remote, though, something was new: they had commit K, which you didn't.

    Your Git therefore obtained commit K from my-remote, and stuck it in your repository. Since this was the first contact with my-remote, your Git then created a remote-tracking name, my-remote/master. So now laptop has:

    laptop:
                 K   <-- my-remote/master
                /
    ...--F--G--H
                \
                 I   <-- master (HEAD), origin/master
    

    (and now you can see why I kept the kink in the graph drawing).
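
The "one ahead, one behind" state can be reproduced and measured locally. This is a sketch under assumptions: upstream.git and origin.git are local bare repositories standing in for my-remote and the GitHub fork, seed is a hypothetical helper that plays "someone else", and commits I and K touch different files.

```shell
set -e
# Placeholder identity so commits work without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# my-remote: a bare repository holding commit H on master.
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo base > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "H"
git -C seed push -q "$work/upstream.git" master

# origin: a bare copy of my-remote, standing in for the fork.
git clone -q --bare upstream.git origin.git

# laptop: clone the fork, add commit I, push it to origin.
git clone -q origin.git laptop
echo mine > laptop/mine.txt
git -C laptop add mine.txt
git -C laptop commit -q -m "I"
git -C laptop push -q origin master

# Meanwhile, commit K lands on my-remote.
echo theirs > seed/theirs.txt
git -C seed add theirs.txt
git -C seed commit -q -m "K"
git -C seed push -q "$work/upstream.git" master

# Add the second remote and fetch from every remote.
cd laptop
git remote add my-remote "$work/upstream.git"
git fetch -q --all

# Count commits reachable from one name but not from the other.
ahead=$(git rev-list --count my-remote/master..master)
behind=$(git rev-list --count master..my-remote/master)
echo "master is $ahead ahead of and $behind behind my-remote/master"
git log --oneline --graph --all
```

The graph printed at the end should show the same fork in history as the drawing: K on one side of H, I on the other.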

    You could tell your Git to tell the origin Git about commit K (sending a copy of K to origin in the process2). But, as you can see from this drawing of what things look like on laptop, there would be a problem at origin: the name master can only point to one commit. You can pick either commit I or commit K, but not both. To point to both commits, so that you can find both I and K, you need at least two names.

    Your repository, on laptop, has the necessary names: my-remote/master points to K, and master points to I. origin/master also points to I, so you're free to have your master point anywhere now: to find commit K you can use the name my-remote/master, and to find I you can use the name origin/master.


    2Since origin is a fork of my-remote, origin could in theory be able to see commit K more directly. In theory, GitHub could arrange for this. If, how, and when they ever do that is up to them: your Git does not need to know anything about it. If, how, and when GitHub ever add a web mechanism, instead of this somewhat roundabout fetch-and-push-again method, to introduce commit K into origin, is likewise up to GitHub. As of today, though, you must use the roundabout method.


    Rebase

    The rebase command can be summarized this way: I have some commit(s) that I mostly like, but there's something about those commits that I don't like. I know that I cannot change anything about any existing commit, so instead of doing that, I'd like to copy the existing commits to new-and-improved ones, in which the things I dislike are changed, and the things I like are retained.

    Let's look again at what we have on laptop at this point:

    laptop:
                 K   <-- my-remote/master
                /
    ...--F--G--H
                \
                 I   <-- master (HEAD), origin/master
    

    What we don't like about commit I are two things:

    • it doesn't have the updates from commit K, and
    • it comes after commit H, instead of after commit K.

    What we do like about it are these things:

    • it makes some changes relative to commit H, and
    • it has a carefully crafted, delightfully detailed commit message explaining why it has these changes.

    We would like to retain those two things while making a new commit that repairs the two things we don't like. We do this with git rebase (which, internally, consists of cherry-picking each of the commits we'd like to copy; there's only one, which is why we just get one new commit).

    Without worrying too much about how that goes—it sometimes gets messy:

    Got conflicts. Resolved manually. then git add file-with-conflict. continue with rebase git rebase --continue.

    but it looks like you did all of this correctly—the end result is a new commit:

    laptop:
                   J   <-- master (HEAD)
                  /
                 K   <-- my-remote/master
                /
    ...--F--G--H
                \
                 I   <-- origin/master
    

    This new commit, J, comes after existing commit K, and makes the right changes—modified as needed due to the now-resolved conflicts—and carries the right commit message text. So new commit J is the one we want. It completely obsoletes old commit I.
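
The copy can be verified after the fact. This sketch reuses the same local stand-ins (upstream.git for my-remote, origin.git for the fork, a hypothetical seed helper) and assumes commits I and K touch different files, so the rebase finishes without the conflicts seen in the question.

```shell
set -e
# Placeholder identity so commits work without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Build my-remote (H, later K), the fork, and the laptop clone with I.
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo base > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "H"
git -C seed push -q "$work/upstream.git" master
git clone -q --bare upstream.git origin.git
git clone -q origin.git laptop
echo mine > laptop/mine.txt
git -C laptop add mine.txt
git -C laptop commit -q -m "I"
git -C laptop push -q origin master
echo theirs > seed/theirs.txt
git -C seed add theirs.txt
git -C seed commit -q -m "K"
git -C seed push -q "$work/upstream.git" master

cd laptop
git remote add my-remote "$work/upstream.git"
git fetch -q --all

# Remember the old tip (I), then copy it on top of K.
old_i=$(git rev-parse master)
git rebase -q my-remote/master

new_j=$(git rev-parse master)
parent_of_j=$(git rev-parse master~1)
tip_of_k=$(git rev-parse my-remote/master)
subject_of_j=$(git log -1 --format=%s master)

echo "old I:        $old_i"
echo "new J:        $new_j"
echo "parent of J:  $parent_of_j"
echo "K:            $tip_of_k"
echo "J's message:  $subject_of_j"
```

J should have a different hash ID from I, K as its parent, and the same commit message as I.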

    How do we know that J replaces I, and that I is no longer useful? The only answer here is that we know it because we did it. Git doesn't know or care. Only we know and care. Our Git, on laptop, has our master pointing to commit J now. Commit I has been abandoned, except for the fact that our origin/master can still find it.

    We now want to tell origin to do the same

    We can now run:

    git push origin master
    

    or:

    git push origin HEAD:master
    

    to have our Git call up GitHub, have them connect to the origin repository, and send them new commit J. This part works, but then we ask them to set their master to point to commit J, and they say no. They say no specifically with the error that means "if I do that, I will abandon some existing commit":

     ! [rejected]        master -> master (non-fast-forward)
    

    This non-fast-forward message means "I will lose some commit(s) off my master".

    The commit they will lose, in this case, is commit I. Commit I is obsoleted by commit J. But they don't know that. We have to force them to abandon commit I.
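
Whether a push would be a fast-forward can be checked locally before trying it: the push is a fast-forward only if the commit origin/master names is an ancestor of the commit being pushed. The sketch below uses the same assumed local stand-ins as before (upstream.git, origin.git, a hypothetical seed helper) and a conflict-free rebase.

```shell
set -e
# Placeholder identity so commits work without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Build my-remote (H, later K), the fork, and the laptop clone with I.
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo base > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "H"
git -C seed push -q "$work/upstream.git" master
git clone -q --bare upstream.git origin.git
git clone -q origin.git laptop
echo mine > laptop/mine.txt
git -C laptop add mine.txt
git -C laptop commit -q -m "I"
git -C laptop push -q origin master
echo theirs > seed/theirs.txt
git -C seed add theirs.txt
git -C seed commit -q -m "K"
git -C seed push -q "$work/upstream.git" master

cd laptop
git remote add my-remote "$work/upstream.git"
git fetch -q --all
old_i=$(git rev-parse master)
git rebase -q my-remote/master

# Is the commit origin/master names (I) an ancestor of master (J)?
if git merge-base --is-ancestor origin/master master; then
    fast_forward=yes
else
    fast_forward=no
fi
echo "fast-forward possible: $fast_forward"

# An ordinary push is therefore refused; keep the error text for reading.
if git push -q origin master 2>"$work/push.err"; then
    pushed=yes
else
    pushed=no
fi
echo "plain push accepted: $pushed"
cat "$work/push.err"

# origin still has commit I on its master.
origin_tip=$(git --git-dir="$work/origin.git" rev-parse master)
```

In this setup the ancestry test fails and the plain push is rejected, leaving origin's master untouched.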

    The way we do that is to use one of the various forcing options (rather appropriate for Star Wars Day):

    git push --force origin master
    

    or:

    git push --force-with-lease origin master
    

    There is a difference between these: --force-with-lease first checks that origin's master still points where your origin/master says it does, and refuses the push if not, while --force overwrites unconditionally. In this particular case (origin is "your" fork, controlled by you, not being pushed-to by anyone else) it doesn't really matter. Both of them work like regular git push except that, in the end, instead of sending a polite request ("please, if it's OK and doesn't lose anything, update your master to remember commit J now"), this kind of push ends with a command of the form "update your master to point to commit J!" The Git at GitHub will obey this command and leave, in the origin repository, this:

    origin:
                   J   <-- master
                  /
                 K
                /
    ...--F--G--H
                \
                 I   ???
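
The forced update can be tried end to end against local stand-ins. As before, this is a sketch under assumptions: upstream.git and origin.git are local bare repositories in place of the two GitHub repositories, seed is a hypothetical helper, and the rebase is conflict-free.

```shell
set -e
# Placeholder identity so commits work without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Build my-remote (H, later K), the fork, and the laptop clone with I.
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo base > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "H"
git -C seed push -q "$work/upstream.git" master
git clone -q --bare upstream.git origin.git
git clone -q origin.git laptop
echo mine > laptop/mine.txt
git -C laptop add mine.txt
git -C laptop commit -q -m "I"
git -C laptop push -q origin master
echo theirs > seed/theirs.txt
git -C seed add theirs.txt
git -C seed commit -q -m "K"
git -C seed push -q "$work/upstream.git" master

cd laptop
git remote add my-remote "$work/upstream.git"
git fetch -q --all
git rebase -q my-remote/master
new_j=$(git rev-parse master)

# Command origin to move its master to J, provided its master is still
# where our origin/master says it is (commit I).
git push -q --force-with-lease origin master

origin_tip=$(git --git-dir="$work/origin.git" rev-parse master)
tracking_tip=$(git rev-parse origin/master)
echo "J on laptop:       $new_j"
echo "master on origin:  $origin_tip"
echo "origin/master:     $tracking_tip"
```

All three hash IDs should be identical: origin now names J, and the laptop's origin/master has been updated to remember that.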
    

    The Git over at GitHub will eventually—this may take a day or a week or more—run git gc and remove commit I, and we should now start drawing its repository as:

    origin:
    
    ...--F--G--H--K--J   <-- master
    

    to simplify it. We can also start doing that with your laptop, except that we need my-remote/master to point to K, still, so we get:

    laptop:
                    J   <-- master (HEAD), origin/master
                   /
    ...--F--G--H--K   <-- my-remote/master
    

    Your laptop Git actually hangs on to commit I for quite a while—at least 30 days by default—before throwing it out, but since it's not visible (in git log output) we don't have to bother drawing it in, at this point.
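
While the laptop still has commit I, it can be found through the reflog for master. The sketch below uses the same assumed local stand-ins and conflict-free rebase as the earlier examples; rescue-I is a hypothetical branch name.

```shell
set -e
# Placeholder identity so commits work without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Build my-remote (H, later K), the fork, and the laptop clone with I.
git init -q --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
echo base > seed/base.txt
git -C seed add base.txt
git -C seed commit -q -m "H"
git -C seed push -q "$work/upstream.git" master
git clone -q --bare upstream.git origin.git
git clone -q origin.git laptop
echo mine > laptop/mine.txt
git -C laptop add mine.txt
git -C laptop commit -q -m "I"
git -C laptop push -q origin master
echo theirs > seed/theirs.txt
git -C seed add theirs.txt
git -C seed commit -q -m "K"
git -C seed push -q "$work/upstream.git" master

cd laptop
git remote add my-remote "$work/upstream.git"
git fetch -q --all
old_i=$(git rev-parse master)
git rebase -q my-remote/master

# The reflog records where master pointed before the rebase moved it.
git reflog show master
previous_tip=$(git rev-parse 'master@{1}')

# Optional: give the old commit a name again so it stays reachable.
git branch rescue-I "$previous_tip"
rescued=$(git rev-parse rescue-I)
```

Here master@{1} should resolve to the hash ID of the original commit I, which is how the abandoned commit can be recovered if the rebase turns out to have been a mistake.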