Git branch diverged although there are no local changes


Let me explain. I have a branch named "development" on which I only read the code and make no changes. Last week I ran into a "branch diverged" problem for no apparent reason; to fix it, I had to run the commands below:

git fetch origin 
git reset --hard origin/development

After this, git status did say that "Your branch is up-to-date with 'origin/development'".

//-----

Today I fetched again and ran git status, which reported:

Your branch and 'origin/development' have diverged, and have 45 and 51 different commits each, respectively.

I can understand that there are 51 new commits on the remote origin/development, because some of my colleagues commit and push code to it.

But I cannot understand why there are 45 local commits that don't exist on the remote. I've never changed anything locally!

//-----

My first question is: does git reset --hard origin/development really give me a clean and up-to-date branch?

If so, then the best guess is that those 45 commits used to be on the remote but were deleted by someone, right? Or are there other possible reasons? I'm 101% sure that I've never changed anything locally.

Is it normal to delete existing commit history on the remote? What actions would delete existing commit history on the remote?

Thank you!


Solution

  • You're on the right track.

    I'm going to be a bit long-winded here because I don't have time to make this shorter. :-) Here's how this happens, as an illustration. Remember that there are two (or more) repositories involved here, which I will just label "yours" and "theirs". Also, each single uppercase letter here stands for some commit (some git object with a 40-character SHA-1 "true name" like e59f6c2d348d465e3147b11098126d3965686098), and the lowercase o's also stand for commits (each of which has its own unique 40-character "true name", but we don't have any reason to try to describe them, and with 40 or more commits we'd run out of uppercase letters).

    The initial setup

    At some point you did a git fetch or git fetch origin (or anything sufficiently similar). Your git called up their git over the Internet-phone and asked their git to package up any commits they had that you didn't. They did so and delivered you their commits, along with telling you that their branch-label development pointed to (contained the 40-character SHA-1 ID of) a commit I will draw in as B here. That commit had some parent commit or commits—I'll assume just one—which git records by storing its raw 40-character SHA-1 ID; the parent had another parent, and so on. This produces a chain:

    ... <- A <- X <- o ... <- o <- B   <-- [their "development"]
    

    This particular chain contains 45 commits, though I did not draw all of them (and there may be some branch-and-merge within the o sequence, but the key is that there's definitely one particular commit A and another particular commit B here). I also labeled one X; we'll see why in a while.

    Your git, having received all of this, does not make any of your branches point to commit B, because your branches are your branches, not to be messed-with until you ask. In order to remember that origin—i.e., their git—said development pointed to B, your git instead stores the ID of B under your "remote branch" label, origin/development:

    ... - A - X - o ... - o - B   <-- origin/development
    

    I've drawn the left-pointing arrows without arrow-heads this time just to make things fit better. In git, a commit always points back to its parents, because commits are permanent and can never be changed, so it's impossible for a parent to point to its children since the children are created after the parents.

    Once your fetch finished, though, you ran git reset --hard to make your (local) branch, development, also point to commit B:

    ... - A - X - o ... - o - B   <-- development, origin/development
    

    The break-up and re-union

    This brings us to just before your most recent git fetch when everything became rather strange. You ran git fetch again, so your git called up their git again and asked their git to send over whatever SHA-1 IDs they had that you didn't ... and this time they sent over 51 new commits (or maybe more, but 51 that apply here). We can draw in this new chain, which is now stored in your own repo, like this:

    ... - A - X - o ... - o - B   <-- development
           \
            Y - o - ... - o - o - C   <-- origin/development
    

    As before, your git recorded their development as your origin/development. Your git did not change any of your own local branches, so it left your development pointing to commit B.

    You're now in the state where you have 45 commits they don't (everything "before and up to B but after A"—we can write this in "git revspec" form as A..B), and they have 51 they just gave you (everything "before and up to C but after A").
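
    This state can be reproduced in a throwaway sandbox. What follows is a minimal sketch, not a record of what actually happened on your origin: the directory names theirs and yours, the file file.txt, and the tiny counts (2 and 3 commits instead of 45 and 51) are all made up for illustration. It rewinds and replaces history in one repository, fetches into a clone that never commits anything, and then counts each side of the split with git rev-list.

    ```shell
    #!/bin/sh
    set -e
    # Fixed identity so commits work without any global git config.
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
    work=$(mktemp -d)

    # "theirs": commits A, X, B on development.
    git init -q "$work/theirs"
    cd "$work/theirs"
    git symbolic-ref HEAD refs/heads/development
    for name in A X B; do
        echo "$name" > file.txt
        git add file.txt
        git commit -q -m "$name"
    done

    # "yours": a clone that only ever reads the code.
    git clone -q "$work/theirs" "$work/yours"

    # Upstream rewinds to A, then adds Y, o, C in place of X, B.
    git reset -q --hard HEAD~2
    for name in Y o C; do
        echo "$name" > file.txt
        git add file.txt
        git commit -q -m "$name"
    done

    # Back in "yours": fetch, then count A..B (left) and A..C (right).
    cd "$work/yours"
    git fetch -q origin
    counts=$(git rev-list --left-right --count development...origin/development)
    ahead=$(printf '%s\n' "$counts" | cut -f1)
    behind=$(printf '%s\n' "$counts" | cut -f2)
    echo "ahead=$ahead behind=$behind"   # prints "ahead=2 behind=3"
    ```

    In a real repository only the last few lines matter: the three-dot form development...origin/development selects the commits that are on one side but not the other, and --left-right --count reports how many belong to each side.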

    How did they get there? The answer is, they "rolled back" the 45 commits that they no longer have, and instead added 51 new commits. Precisely how they did it, and who did it, are not knowable, but we can make a very good guess.

    Recall the phrase from earlier: commits are permanent. You can't alter a commit. And yet, git has rebase -i (and other tools) that let you seem to alter old commits.

    These actually work by copying commits. You (as someone using git) identify a commit you want to "change", in this case, commit X. You instruct git to extract commit X, then you make some slight change—maybe not even a change to the source code, maybe just a change to the commit message—and you make a new commit Y. (A better name for this is X', indicating that it's a copy of X with some slight change, but I don't know for sure whether they did this, or simply discarded X and started the copying from the first o after X. You can do the latter quite easily in git rebase -i by deleting the pick line.)

    Once you have a commit copied-but-changed (or skipped and using the next commit), that commit itself has a new SHA-1 "true name", so every subsequent commit gets its own new ID as well. This makes a new chain that more or less parallels the old chain.
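
    The ripple effect on IDs can be seen directly with git's low-level commit-tree command. This is only a sketch under assumed names (a scratch repository with a single file.txt and commits A, X, B); it copies X with a new message and then copies B on top of that copy, which is roughly what rebase does for you behind the scenes.

    ```shell
    #!/bin/sh
    set -e
    # Fixed identity so commits work without any global git config.
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
    repo=$(mktemp -d)
    cd "$repo"
    git init -q .

    # Build the chain A <- X <- B.
    for name in A X B; do
        echo "$name" > file.txt
        git add file.txt
        git commit -q -m "$name"
    done
    a=$(git rev-parse HEAD~2)
    x=$(git rev-parse HEAD~1)
    b=$(git rev-parse HEAD)

    # Copy X with only a new message: same tree, same parent A.
    x_copy=$(git commit-tree "$x^{tree}" -p "$a" -m "X, reworded")
    # Copy B onto the copy: same tree and message as B, but a new parent.
    b_copy=$(git commit-tree "$b^{tree}" -p "$x_copy" -m "B")

    echo "X  = $x"
    echo "X' = $x_copy"
    echo "B  = $b"
    echo "B' = $b_copy"
    ```

    Even though B' has exactly the same files and message as B, its ID differs because its parent differs. The original X and B are untouched; they are simply no longer the ones a label would point to after a rewrite.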

    What do you do next?

    In this case, you had no new commits of your own, so for you it's really simple: you just use git reset --hard again to point your development to commit C:

    ... - A - X - o ... - o - B   [abandoned]
           \
            Y - o - ... - o - o - C   <-- development, origin/development
    

    If you had your own commits on your development, you would have had a harder job, or at least, you would have if you wanted to continue cooperating with origin: you'd have to copy your commits, and only your commits, from your development to add copies onto the end of their commit C.

    (Git now has a nice way to do this semi-automatically, using git rebase --fork-point, but it's still a bit annoying and difficult. This is why it's generally bad for an "upstream" like origin to rewind-and-replace history: it forces everyone "downstream" to do extra work.)
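
    For the harder case, here is a hedged sketch of the --fork-point approach in a sandbox. Everything in it is hypothetical (the repositories theirs and yours, the files upstream.txt and mine.txt, and the single local commit M); it assumes your local work does not conflict with the rewritten upstream commits. The idea is that git consults the reflog of origin/development to work out that your commit M was built on the old B, and therefore copies only M onto C.

    ```shell
    #!/bin/sh
    set -e
    # Fixed identity so commits work without any global git config.
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
    work=$(mktemp -d)

    # "theirs": commits A, X, B on development.
    git init -q "$work/theirs"
    cd "$work/theirs"
    git symbolic-ref HEAD refs/heads/development
    for name in A X B; do
        echo "$name" > upstream.txt
        git add upstream.txt
        git commit -q -m "$name"
    done

    # "yours": clone, then add one commit M of your own on top of B.
    git clone -q "$work/theirs" "$work/yours"
    cd "$work/yours"
    echo "my work" > mine.txt
    git add mine.txt
    git commit -q -m "M (local work)"

    # Upstream rewinds to A and replaces X, B with Y, C.
    cd "$work/theirs"
    git reset -q --hard HEAD~2
    for name in Y C; do
        echo "$name" > upstream.txt
        git add upstream.txt
        git commit -q -m "$name"
    done

    # Fetch the rewritten history, then copy only M onto the new tip C.
    cd "$work/yours"
    git fetch -q origin
    git rebase -q --fork-point origin/development
    git --no-pager log --oneline development
    ```

    After the rebase, development should contain A, Y, C and a copy of M, with the old X and B left behind.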

    Aside: what happens to the "abandoned" commits?

    They stick around for a while, findable through git's "reflogs". Reflog entries for commits that are no longer reachable from the branch tip expire after 30 days by default (the gc.reflogExpireUnreachable setting). After that, their permanence goes away because the reflog that keeps them around expires. So it's not quite true that commits are permanent; instead, they're read-only, but get removed (garbage-collected) once they are "unreferenced".

    As long as you keep a visible reference (like a branch or tag name) pointing to them, though, they'll remain in your repository.
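
    As a small illustration of both points, the sketch below abandons two commits in a scratch repository, finds them again through the reflog, and pins them with a branch. The branch name old-development and the file file.txt are invented for the example.

    ```shell
    #!/bin/sh
    set -e
    # Fixed identity so commits work without any global git config.
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
    repo=$(mktemp -d)
    cd "$repo"
    git init -q .
    git symbolic-ref HEAD refs/heads/development
    for name in A X B; do
        echo "$name" > file.txt
        git add file.txt
        git commit -q -m "$name"
    done
    old_tip=$(git rev-parse HEAD)

    # Abandon X and B by moving the label back to A.
    git reset -q --hard HEAD~2

    # The reflog still remembers where development pointed before the reset.
    git --no-pager reflog show development

    # Pin the abandoned chain with a visible reference so it is kept.
    git branch old-development "development@{1}"
    git --no-pager log --oneline old-development
    ```

    Here development@{1} means "where development pointed one move ago", which is the old commit B.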

    This leads to a way to think about git commits without driving yourself crazy: the commits are permanent, but the labels move. For "normal" commits, the label simply moves to the newly-added commit. When you use a command like "git rebase" to "change history", git simply copies the old commits, then pastes the label on the end of the new chain of commits.

    (This is also how git commit --amend works: it doesn't change the final commit, instead it makes a new commit whose parent is the same as the parent of the old commit, and then moves the branch label. That is:

    ... - C - D   <-- label
    

    becomes:

            D     [abandoned]
           /
    ... - C - D'   <-- label
    

    If you close your eyes to D and ignore the little tick on D', it looks like you've changed the final commit.)
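
    The same thing can be checked in a scratch repository. This is a minimal sketch with made-up commits C and D and a made-up file file.txt; it amends D and then compares the old and new commits.

    ```shell
    #!/bin/sh
    set -e
    # Fixed identity so commits work without any global git config.
    export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
    export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
    repo=$(mktemp -d)
    cd "$repo"
    git init -q .
    for name in C D; do
        echo "$name" > file.txt
        git add file.txt
        git commit -q -m "$name"
    done
    d=$(git rev-parse HEAD)

    # "Change" the final commit: git really makes a new commit D'.
    git commit -q --amend -m "D, reworded"
    d_prime=$(git rev-parse HEAD)

    echo "D  = $d"
    echo "D' = $d_prime"
    # Both commits share the parent C ...
    git rev-parse "$d^" "$d_prime^"
    # ... and the old D still exists as an object; only the label moved.
    git cat-file -t "$d"
    ```

    The two IDs differ, the two parent IDs are identical, and the old D is still in the repository until it is eventually garbage-collected.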