Tags: git, visual-studio-code, gitlens

GitLens "Switch to Commit" made my newer commits disappear


Very scary situation right now: I used the GitLens extension for VS Code to jump back to an older commit. I wanted to check out the commit, so I located it in the COMMITS sidebar, right-clicked, and selected Switch to Commit.... I expected to check out that commit and then be able to switch back to my current state.

Now running git log shows me my commits only up to the commit that I selected. This is scary. Where are my newer commits?

As it is now, I cannot locate my newer commits and go back to them. I made a new commit just before switching to the older commit, so I am 100% certain there should be newer commits. This is a new project that I have not pushed to any remote yet, so git pull cannot bail me out.

I really hope someone can help me, I do not want to lose 2 days of work...


Solution

  • This is scary to those new to Git. But don't worry: all the commits are still there.

    Various GUIs, including Visual Studio, put a layer between you and Git (which could be good or bad, depending on your point of view), so that you can't see what's really going on. I don't use these GUIs for exactly that reason, so I can't say precisely what each clicky button in yours does. Git itself, however, works like this:

    • There is, at all times,[1] a current commit. Git has a special name for this commit: HEAD, written in all uppercase just like this.[2]

    • At most times, there is also a current branch. Git has a special name by which you can access this current branch: HEAD.

    You might—in fact, you should—object at this point: how do we know whether HEAD refers to the commit or to the branch name? Git's answer is: I pick one or the other based on whichever one I want at the moment. Some things need a branch name, in which case, HEAD turns into the branch name. Some things need a commit, in which case HEAD turns into the commit. Basically there are two internal ways Git has to ask what's the HEAD now. One gives a branch-name answer, like master or main or whatever, and the other gives you a raw commit hash ID.
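
The two "what is HEAD now?" questions can be asked from the command line. The following is a minimal sketch in a throwaway repository; the repository location, the user name and email, and the variable names are assumptions made for illustration only.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch "main" on any Git version
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "first commit"

# Question 1: which branch name is HEAD attached to?
head_branch=$(git symbolic-ref --short HEAD)
# Question 2: which commit does HEAD resolve to?
head_commit=$(git rev-parse HEAD)

echo "branch: $head_branch"
echo "commit: $head_commit"
```

The first command gives the branch-name answer and the second gives the raw hash ID, which here is the same hash that the name main resolves to.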

    OK, so, with this in mind, we now remember that git log prints out the log like this:

    commit eb27b338a3e71c7c4079fbac8aeae3f8fbb5c687 (...)
    Author: ...
       ...
    
    commit fe3fec53a63a1c186452f61b0e55ac2837bf18a1
    ...
    

    That is, we see all these weird hash IDs spill out, one at a time. The hash IDs are the actual, true names of each commit. Each commit gets a globally unique hash ID: no two different commits are ever allowed to have the same one. That's why the hash IDs are so big and ugly. They look random. They aren't actually random, but they are unpredictable.[3]

    A branch name like main translates to a commit hash ID. A raw hash ID already is a hash ID. Either way, given the right hash ID, Git can find the commit.

    Each commit holds a full snapshot of every file,[4] plus some metadata: information about the commit itself, such as who made it, and when, and the log message its author wrote at the time. Crucially for Git itself, one item in this metadata is the raw hash ID of the previous commit.

    There's one other fact about commits that is useful to remember here: once made, no part of any commit can ever be changed. That's how the hash IDs actually work, and it's critical to Git being a distributed version control system. But it also means that no Git commit can ever contain the raw hash ID of its future child commits, because we have no idea what those will be when we create the commit. Commits can store the "names" (hash IDs) of their parents, because we do know their ancestry when we create the children.
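
To see the stored parent hash ID for yourself, you can print a raw commit object. This is a small sketch in a scratch repository with two empty commits labeled G and H to match the drawings; the variable names g and parent_in_h are made up for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "G"
g=$(git rev-parse HEAD)            # remember G's hash ID before making H
git commit -q --allow-empty -m "H"

# Print commit H as stored: tree (snapshot), parent, author, committer, message.
git cat-file -p HEAD

# Pull out just the "parent" line of H.
parent_in_h=$(git cat-file -p HEAD | sed -n 's/^parent //p')
echo "H records parent: $parent_in_h"
echo "G is:             $g"
```

The parent line inside H is G's hash ID, while nothing inside G mentions H.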

    What this means for us here is that the commits remember their parents, which forms a sort of backwards-looking chain. All we have to do is remember the raw hash ID of the latest commit. When we do that, we end up with a chain that we can draw like this:

    ... <-F <-G <-H   <--main
    

    Here, the name main holds the real hash ID of the latest commit, which for drawing purposes, we just call H. Commit H in turn holds the hash ID of earlier commit G, which holds the hash ID of still-earlier commit F, and so on.

    We can now see how git log works: it starts with the current commit, H, as selected by the current branch, main. To make main be the current branch, we attach the special name HEAD to the name main:

    ...--F--G--H   <-- main (HEAD)
    

    Git uses HEAD to find main, uses main to find H, and shows us H. Then Git uses H to find G and shows us G; then it uses G to find F, and so on.
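
The backwards walk can be imitated by hand, which may make it clearer that git log only ever follows parent links. This is a rough sketch, not how git log is implemented: it follows first parents only, and the loop and variable names are invented for illustration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
for msg in F G H; do
    git commit -q --allow-empty -m "$msg"
done

commit=$(git rev-parse HEAD)       # HEAD -> main -> H
walked=""
while [ -n "$commit" ]; do
    subject=$(git log -1 --format=%s "$commit")
    echo "$commit $subject"
    walked="$walked$subject"
    # "<hash>^" names the first parent; with -q nothing is printed when there is none.
    commit=$(git rev-parse -q --verify "$commit^" || true)
done
```

The loop starts at H and stops after F, because F is the first commit and has no parent to move to.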

    When we want to look at any historical commit, we pick it out, by hash ID, and tell Git: attach HEAD directly to that commit. We can draw that like this:

    ...--F   <-- HEAD
          \
           G--H   <-- main
    

    When we run git log now, Git translates HEAD to a hash ID—which it finds directly this time; there's no attached branch name—and shows us commit F. Then git log moves on from there, backwards. Where are commits G and H? They are nowhere to be seen!

    But it's OK: if we run git log main, git log starts with the name main, rather than with the name HEAD. That finds commit H, which git log shows; then git log moves to G, and so on. Or, we can even run:

    git log --branches
    

    or:

    git log --all
    

    to have git log find all branches or all refs ("refs" include branches and tags, but also other kinds of names).
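
The situation from the question can be reproduced safely in a scratch repository. The sketch below assumes three empty commits labeled F, G and H, as in the drawings, and checks out F directly so that HEAD is detached; the variable names are illustrative.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
for msg in F G H; do
    git commit -q --allow-empty -m "$msg"
done

git checkout -q HEAD~2             # attach HEAD directly to commit F

# Starting from HEAD, only F (and anything before it) is visible.
from_head=$(git log --format=%s | tr -d '\n')
# Starting from the name main, G and H are visible again.
from_main=$(git log --format=%s main | tr -d '\n')
# Starting from all refs; sorted here because the order in which git log
# interleaves several starting points is a separate topic.
from_all=$(git log --format=%s --all | sort | tr -d '\n')

echo "git log       : $from_head"
echo "git log main  : $from_main"
echo "git log --all : $from_all"
```

Nothing was deleted at any point: the plain git log simply started from F and walked backwards, so it never had a reason to visit G or H.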

    (This brings up another, separate can-of-worms, which is all about how git log handles the case of "wanting" to show more than one commit "at the same time". I won't go there at all, in this answer.)

    This "viewing a historical commit" mode, in Git, is called detached HEAD mode. That's because the special name HEAD is no longer attached to a branch name. To re-attach your HEAD, you simply choose a branch name, with git checkout or (Git 2.23 or later) git switch:

    git switch main
    

    for instance. You've now checked out the commit that the branch name main selects, and HEAD is now re-attached to the name main.
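
Here is a short sketch of detaching and re-attaching HEAD in a scratch repository. The helper head_state is a made-up name; it relies on git symbolic-ref failing quietly when HEAD is not attached to a branch. The sketch uses git checkout so that it also runs on Git versions older than 2.23.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
for msg in F G H; do
    git commit -q --allow-empty -m "$msg"
done

# Hypothetical helper: report whether HEAD is attached to a branch name.
head_state() {
    if git symbolic-ref -q HEAD >/dev/null; then echo attached; else echo detached; fi
}

git checkout -q HEAD~2             # view historical commit F
state_before=$(head_state)

git checkout -q main               # or: git switch main (Git 2.23 or later)
state_after=$(head_state)
log_after=$(git log --format=%s | tr -d '\n')

echo "before: $state_before, after: $state_after, log: $log_after"
```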

    Before we stop, there's one more really important thing to learn, which is: how branches grow. But let me get footnotes out of the way first.


    [1] There's an exception to this rule, necessary in a new, totally empty repository that has no commits at all. That exception can be used in a weird way later, in a non-empty repository. You won't be making use of this though.

    [2] The lowercase variant, head, often "works" on Windows and macOS (but not on Linux and others). However, this is deceptive, because if you start using the git worktree feature, head (lowercase) doesn't work correctly (it gets you the wrong commit sometimes!) while HEAD (uppercase) does. If you don't like typing in all-caps, consider using the shorthand @ character, which you can use instead of HEAD.

    [3] Git uses cryptographic hashing here: the same kind of stuff one finds in cryptocurrencies, though not as strict (Git currently still uses SHA-1, which is already outdated in cryptographic terms).

    [4] The snapshots are stored in a special, read-only, Git-only, compressed and de-duplicated format. Git shows commits as "changes since previous commit" but stores commits as snapshots.


    How Git branches grow

    Suppose we have the following situation:

    ...--G--H   <-- main (HEAD)
    

    We now want to make a new commit, but we'd like to put it on a new branch. So we first ask Git to make a new branch name, and point that name to commit H too:

    git branch develop
    

    which results in:

    ...--G--H   <-- develop, main (HEAD)
    

    Now we pick develop as the name to have HEAD attached to, with git checkout or git switch:

    ...--G--H   <-- develop (HEAD), main
    

    Note that we're still using commit H. We're just using it through the other name now. The commits up through and including H are on both branches.
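
A quick way to convince yourself that creating and switching to a branch changes no commits is to compare the hash IDs behind both names. This sketch assumes a scratch repository with two empty commits labeled G and H; the variable name current is illustrative.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
for msg in G H; do
    git commit -q --allow-empty -m "$msg"
done

git branch develop                 # new name, pointing at the current commit H
git checkout -q develop            # attach HEAD to develop (or: git switch develop)

current=$(git symbolic-ref --short HEAD)
echo "HEAD is attached to: $current"
echo "main    -> $(git rev-parse main)"
echo "develop -> $(git rev-parse develop)"
```

Both names print the same hash ID, because both still select commit H.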

    We now make a new commit, the usual way we do in Git. Once we're ready, we run git commit and give Git a log message to put in the metadata for the new commit. Git now:

    • saves a snapshot of every file (de-duplicated as usual);
    • uses the current commit as the parent for the new commit, so that our new commit—which we'll call I—will point backwards to existing commit H;
    • adds our configured user.name and user.email as the author and committer of this new commit, using "now" as the date-and-time;
    • uses our log message; and
    • actually writes all of this out as a commit, which assigns it its unique hash ID. (The uniqueness comes in part from the date-and-time stamp, and in part from the input hash ID H, and in part from the snapshot we've saved: everything that is in the new commit goes into making up the new random-looking hash ID, which is why we can't predict it.)
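
The claim that every input feeds into the hash ID can be checked with the low-level git commit-tree command, which writes a commit object without touching any branch. This is only an illustration under assumed inputs: the fixed timestamps, the message "I", and the helper name make_commit are all invented for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "H"

tree=$(git rev-parse 'HEAD^{tree}')    # the snapshot to reuse
parent=$(git rev-parse HEAD)           # commit H

# Hypothetical helper: write a commit object using a fixed date ($1).
make_commit() {
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
        git commit-tree "$tree" -p "$parent" -m "I"
}

a=$(make_commit "1577836800 +0000")
b=$(make_commit "1577836800 +0000")    # identical inputs
c=$(make_commit "1577836801 +0000")    # one second later

echo "same inputs:    $a"
echo "same inputs:    $b"
echo "different time: $c"
```

Identical inputs produce the identical hash ID, and changing nothing but the timestamp by one second produces a different one.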

    So now we have this new commit I, pointing back to existing commit H:

    ...--G--H
             \
              I
    

    Now Git does the other bit of magic that makes it all work: git commit writes I's hash ID into the current branch name. That is, Git uses HEAD to find the name of the current branch, and updates the hash ID stored in that branch name. So our picture is now:

    ...--G--H   <-- main
             \
              I   <-- develop (HEAD)
    

    The name HEAD is still attached to the branch name develop, but the branch name develop now selects commit I, not commit H.
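
The same steps can be replayed in a scratch repository to watch the branch name move. In this sketch the commits are empty and labeled G, H and I to match the drawing, and the variables h and i just hold their hash IDs.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
for msg in G H; do
    git commit -q --allow-empty -m "$msg"
done
h=$(git rev-parse HEAD)

git checkout -q -b develop         # create develop at H and attach HEAD to it
git commit -q --allow-empty -m "I" # new commit I; its hash goes into develop
i=$(git rev-parse HEAD)

echo "main        -> $(git rev-parse main)"
echo "develop     -> $(git rev-parse develop)"
echo "parent of I -> $(git rev-parse 'develop^')"
```

The name main still holds H's hash ID, develop now holds I's, and I's parent is H.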

    It's commit I that leads back to commit H. The name just lets us find the commit. The commits are what really matter: branch names are just there to let us find the last commit. Whatever hash ID is in that branch name, Git says that that commit is the last commit on that branch. So since main says H right now, H is the last commit on main; since develop says I right now, I is the last commit on develop. Commits up through H are still on both branches, but I is only on develop.

    Later, if we like, we can have Git move the name main. Once we move main to I:

    ...--G--H--I   <-- develop, main
    

    then all commits are once again on both branches. (I left out HEAD this time because we might not care which branch we are "on", if both select I. In fact, we can delete either name—but not both—because both names select the same commit and that's all we need to find the right hash ID. If we were to write this hash ID down somewhere, we might not need any name. But that would be ... yucky, at best. We have a computer; let's have it save the big ugly hash IDs for us, in nice neat names.)
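
One common way to move main forward is a fast-forward merge; that choice is an assumption here, since the text above does not say which command does the moving. The sketch then deletes the develop name to show that commit I stays reachable through main.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
for msg in G H; do
    git commit -q --allow-empty -m "$msg"
done
git checkout -q -b develop
git commit -q --allow-empty -m "I"
i=$(git rev-parse HEAD)

git checkout -q main
git merge -q --ff-only develop     # slide main forward from H to I; no new commit
git branch -d develop              # delete one of the two names

log_main=$(git log --format=%s main | tr -d '\n')
echo "main -> $(git rev-parse main)"
echo "log:    $log_main"
```

After the name develop is gone, git log main still shows I, H and G, because the commits themselves were never tied to that name.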