Tags: git, version-control, terminology, distributed

What do origin/main and origin/HEAD mean in my commit, and why are they red?


I've started using Git, and I've recently learned to use

git log --all --graph

to look at my commit history, and I noticed some details that I find worrying. For example, the line to the left is red, which indicates to me that something is wrong. Moreover, just after the name of the commit there is some text which I have trouble interpreting. I think that HEAD -> main means that the HEAD is pointing to main. But then there are two more strings (also in red, indicating something is wrong) which I think are remotes?

Is the line to the left indicating something is wrong, or am I reading it wrong? What does the text next to the commit number mean? What do the three parts mean? Is there something wrong, and if so how can I fix it?

This is the topmost line in my log. What do origin/main and origin/HEAD mean, and why are they red?

* commit 9bee2dac5bb71b843022409d33a46af52be217d4 (HEAD -> main, origin/main, origin/HEAD)

Solution

  • Git is really all about commits. The commits have hash IDs:

    * commit 9bee2dac5bb71b843022409d33a46af52be217d4 (...)
    

    That 9bee...7d4 thing there is the hash ID for your latest commit on your main branch. The commit that is the latest will change over time, as you and others add new commits, but for now, 9bee2da... is the latest.

    If Git didn't have easier ways to name commits, this would drive all the humans working with Git insane. (Some say that Git does that anyway.) Imagine having to memorize these big ugly random-looking things just to get your commits! But you don't have to do that: Git will save the hash ID of the latest commit of each of your branches, in a branch name.

    Repositories, then, consist primarily of two databases: one—usually the biggest—holds the commits and supporting internal Git objects. These all have hash IDs; the commits' hash IDs are the ones you will deal with. To help you—and Git itself—find the commit and other hash IDs, the repository also stores a bunch of names: branch names, tag names, refs/stash for git stash, and so on. Each name in this big table of names stores one hash ID, and that's the other database in a repository: a set of <name, hash-ID> pairs. Some of these names are branch names, and those are the branches in the repository.

    (The objects in the objects database are fully read-only: once you make a commit you can never change it. The names in the names database store hash IDs, but the stored hash IDs can be replaced at any time, or even deleted and/or new names created. So the names change over time, to select the latest commit. We'll skip all the rest of the details here, though they do matter.)
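The two databases can be inspected directly. The following is a minimal sketch, assuming a throwaway repository in a temporary directory; the path, the `Demo` identity, and the commit message are hypothetical and chosen only for illustration.

```shell
# Build a throwaway repository to inspect (hypothetical scratch path).
repo=$(mktemp -d)
git init -q "$repo"
# Point the unborn HEAD at "main" so the example behaves the same on any Git version.
git -C "$repo" symbolic-ref HEAD refs/heads/main
# Placeholder identity (an assumption), set only for this one command.
git -C "$repo" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"

# Names database: every name stores exactly one hash ID.
git -C "$repo" for-each-ref --format='%(refname) %(objectname)'

# Objects database: look an object up by the hash ID that a name gave us.
hash=$(git -C "$repo" rev-parse main)
kind=$(git -C "$repo" cat-file -t "$hash")
echo "main -> $hash ($kind)"
```

After a second commit, rerunning `git rev-parse main` would print a different hash ID: the name moved, while the first commit object stayed exactly as it was.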

    Git isn't just a version control system (VCS) though: it's a distributed VCS. Git does this distribution trick by letting us copy a repository. We use git clone to do that. When we clone someone else's Git repository, we get all their commits[1] and none of their branches. The hash IDs of their commits are the hash IDs of our commits at this point: the hash ID of any one commit is totally unique to that particular commit, and every Git repository that has that commit uses that hash ID for it. That hash ID is now reserved for that commit. (This is why they're so big and ugly: that way, there's always a fresh new ID available for all of your new commits.)

    To remember their branch names, our Git, in our clone, creates or updates our own remote-tracking names. Our Git software remembers the URL we used to make the clone, storing it under a name that is shorter than the URL. There's a standard first name here, origin, which everybody uses,[2] so the URL for their Git repository is stored under the name origin. Git then uses that same name to stick in front of each of their branch names: their main becomes our origin/main, their develop (if they have one) becomes our origin/develop, and so on.

    So your origin/* names are simply a reflection of the fact that the repository you cloned has some branch names. Their branch names = your remote-tracking names, because your Git sees their names and changes them, to let your repository remember their branches.
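To see the renaming happen, here is a small sketch that clones one scratch repository into another. The directory names `theirs` and `ours`, the `develop` branch, and the `Demo` identity are all assumptions made up for this example.

```shell
work=$(mktemp -d)

# "Their" repository: one commit, reachable from two branch names.
git init -q "$work/theirs"
git -C "$work/theirs" symbolic-ref HEAD refs/heads/main
git -C "$work/theirs" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"
git -C "$work/theirs" branch develop

# "Our" clone of it.
git clone -q "$work/theirs" "$work/ours"

# The URL (here a local path) is stored under the short name "origin".
url=$(git -C "$work/ours" remote get-url origin)
echo "origin -> $url"

# Their branch names show up in our clone under refs/remotes/origin/.
remotes=$(git -C "$work/ours" for-each-ref --format='%(refname)' refs/remotes)
echo "$remotes"
```

Full ref names are printed here on purpose: `refs/remotes/origin/main` is the name that Git shortens to `origin/main` in `git log` output.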

    Because you and they share the commits (you got all your commits from them), your origin/main remembers commit hash ID 9bee2da..., the commit their main named when you cloned. Once your git clone operation has copied their commits and turned their branch names into your remote-tracking names, your own Git software creates one new branch in your repository. The name your Git used here was main.[3] So your Git created your own main to match your origin/main, which your Git uses to remember their main.

    That means you have:

    • the branch name main: this is your name for (at the moment) commit 9bee2da...;
    • the remote-tracking name origin/main: this is your copy of their branch name that, the last time your Git talked with their Git software and repository, named commit 9bee2da....

    So when you run git log, your Git:

    1. uses the name main to find 9bee2da...;
    2. uses the hash ID 9bee2da... to find the commit;
    3. shows the commit 9bee2da...;
    4. adds, in parentheses, the decorations you saw: some in green, some in red.

    (There's actually a step before step 1: your Git uses HEAD to find main. But we're leaving that part out for now.)
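The lookup chain above, including the HEAD step, can be walked by hand with plumbing commands. This is a sketch under stated assumptions: a scratch clone built from a hypothetical `theirs` repository, with a placeholder `Demo` identity.

```shell
work=$(mktemp -d)
git init -q "$work/theirs"
git -C "$work/theirs" symbolic-ref HEAD refs/heads/main
git -C "$work/theirs" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"
git clone -q "$work/theirs" "$work/ours"
cd "$work/ours"

branch=$(git symbolic-ref HEAD)           # step 0: HEAD holds a branch name
hash=$(git rev-parse "$branch")           # step 1: the branch name holds a hash ID
kind=$(git cat-file -t "$hash")           # step 2: the hash ID finds an object
git cat-file -p "$hash"                   # step 3: show the raw commit
decor=$(git log -1 --format=%D "$hash")   # step 4: the names that decorate it

# The remote-tracking name resolves to the same commit as our branch.
remote_hash=$(git rev-parse origin/main)
echo "$branch -> $hash ($kind) [$decor]"
```

The `%D` placeholder prints the same names `git log` puts in parentheses, without the parentheses and without color.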


    [1] We can get less than all commits, but "copy all commits" is the way to think about this at first, at least.

    [2] You can use something else if you want. If you do, your remote-tracking names will be a little different. But everyone else will expect to see origin, so there is no point to doing extra work to use a different name.

    [3] You tell your Git, at git clone time, which of their branch names you want your Git to copy. To select their develop, for instance, you'd run git clone -b develop .... If you don't pick a branch name, your Git asks their Git which name they recommend, and usually that's their main or master or whatever.


    Green and red, but nothing wrong

    "For example, the line to the left is red, which indicates to me that something is wrong."

    No: Git just uses eight colors by default, namely red, green, blue, yellow, cyan, magenta, black, and white. This is all that was commonly available 20 years ago. There are some additional words allowed here, and modern Git can use 24-bit color if it's supported by your terminal (see Git pretty format colors). For the line-drawing part (see Pretty Git branch graphs), Git starts with red (and there's no easy way to configure this).
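Unlike the graph lines, the colors of the names in parentheses can be changed through the `color.decorate.<slot>` settings. A minimal sketch, assuming a scratch repository; the `bold blue` choice is arbitrary, and the `origin/main` ref is created by hand here only as a stand-in for a real remote-tracking name.

```shell
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" symbolic-ref HEAD refs/heads/main
git -C "$repo" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"

# Stand-in remote-tracking name (normally created by git clone or git fetch).
git -C "$repo" update-ref refs/remotes/origin/main "$(git -C "$repo" rev-parse main)"

# Recolor remote-tracking names in log decorations, for this repository only.
git -C "$repo" config color.decorate.remoteBranch "bold blue"
setting=$(git -C "$repo" config --get color.decorate.remoteBranch)

# --color=always forces color codes even when output is not a terminal.
git -C "$repo" log -1 --oneline --decorate --color=always
```

Adding `--global` to the `git config` command would apply the same setting to every repository for the current user.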

    "I think that HEAD -> main means that the HEAD is pointing to main."

    That's correct: the special name HEAD—which is not actually a branch name—normally holds the name of some branch; when it does, we say that the name HEAD is attached to, or points to that branch name.

    "But then there are 2 more strings (also in red indicating something is wrong) which I think are remotes?"

    These are the remote-tracking names I mentioned above. git branch -a will show both branch names (which are always local, so "local branch" is redundant, like saying "ATM machine", but sometimes it feels appropriate anyway) and remote-tracking names; it prints the branch names in green by default, and the remote-tracking names in red by default. I'm not sure if this is meant as a mnemonic device ("red = remote"), but you can use it as one, if you like. (But then what does green equal? "Glocal" sounds way too much like "global". 😀)

    The second red name is origin/HEAD: this is your Git's copy of the other Git's HEAD, more or less. However, Git doesn't update it the way it updates remote-tracking names.[4] If you think their HEAD may have changed, you can run git remote set-head origin --auto to have your Git call up their Git and find where their HEAD is now. But there's very little use for origin/HEAD in my opinion, so I never bother.
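Here is a sketch of what that looks like in practice, assuming two scratch repositories named `theirs` and `ours` (hypothetical names) where their HEAD moves to a new `develop` branch after we cloned.

```shell
work=$(mktemp -d)
git init -q "$work/theirs"
git -C "$work/theirs" symbolic-ref HEAD refs/heads/main
git -C "$work/theirs" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"
git clone -q "$work/theirs" "$work/ours"

# origin/HEAD is a symbolic name pointing at one of our remote-tracking names.
before=$(git -C "$work/ours" symbolic-ref refs/remotes/origin/HEAD)

# Their HEAD moves to a new branch; we fetch to learn about that branch.
git -C "$work/theirs" checkout -q -b develop
git -C "$work/ours" fetch -q origin

# Ask their Git where its HEAD is now and record the answer.
git -C "$work/ours" remote set-head origin --auto
after=$(git -C "$work/ours" symbolic-ref refs/remotes/origin/HEAD)
echo "origin/HEAD: $before -> $after"
```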


    [4] Every time you run git fetch origin, your Git calls up their Git again, using the URL Git saved under the name origin, picks up any new commits they have that you don't, and updates your remote-tracking names based on their branch names. In fact, git clone is really just shorthand for running git init + git remote add origin ... + git fetch origin + a few more steps: it's the git fetch step that creates all your remote-tracking names initially, and gets all their commits initially. Since git fetch defaults to "get the commits they have that I don't", and initially you have no commits, it initially gets all their commits. Because the fetch step creates or updates only remote-tracking names, never branch names, git clone copies all their commits and none of their branches.
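The first three steps of that shorthand can be run by hand to confirm that fetching brings over commits and remote-tracking names but no branches. This is only a sketch: the scratch `theirs` and `ours` directories and the `Demo` identity are assumptions, and the "few more steps" that git clone performs afterward are deliberately left out.

```shell
work=$(mktemp -d)
git init -q "$work/theirs"
git -C "$work/theirs" symbolic-ref HEAD refs/heads/main
git -C "$work/theirs" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first commit"

# The first three steps that git clone performs, done manually.
git init -q "$work/ours"
git -C "$work/ours" remote add origin "$work/theirs"
git -C "$work/ours" fetch -q origin

# No branch names exist yet, only remote-tracking names.
local_branches=$(git -C "$work/ours" for-each-ref --format='%(refname)' refs/heads)
remote_names=$(git -C "$work/ours" for-each-ref --format='%(refname)' refs/remotes)

# Their commit is nevertheless already in our objects database.
their_hash=$(git -C "$work/theirs" rev-parse main)
kind=$(git -C "$work/ours" cat-file -t "$their_hash")
echo "branches: [$local_branches] remote-tracking: [$remote_names] object: $kind"
```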