Tags: git, git-branch, git-checkout, git-fork, upstream-branch

Difference between `git checkout -b newbranch upstream/newbranch` and `git checkout newbranch`


I have read this answer about importing an upstream branch into a fork. The answer suggests using git checkout -b newbranch upstream/newbranch to switch to the new branch. I have always just used git checkout newbranch in this case, and it worked as well. Is there any difference between these commands? My guess is that I only need -b to specify a branch name in case it should differ from upstream/newbranch. But if I just want the branch under its original name newbranch, is there any difference between using git checkout -b newbranch upstream/newbranch and git checkout newbranch? I have read the documentation for -b, but it doesn't actually answer my question.


Solution

  • The existing answer doesn't cover exactly how this works, which is a little bit complicated. Internally, Git calls this thing DWIM ("do what I mean") mode.

    Long-ish: background

    Let's start with this: your branch names are yours. Some other Git might have a branch named newbranch, or branch2, or whatever, but if you don't have that branch name, you just don't have that branch name. Well, not yet.

    Remember also that every commit has a unique hash ID. To see the hash ID of the current commit, run:

    git rev-parse HEAD
    

    The special name HEAD always names the current commit (and usually names the current branch name too, but we'll leave that for later). The git rev-parse command will get you the big ugly hash ID—not all that useful to humans, but crucial for Git, because that hash ID is how Git actually finds the commit.

    Meanwhile, each branch name just holds one (1) commit hash ID. If you have a branch name master, you can find the hash ID that this name represents by running git rev-parse master. As before, git rev-parse turns the name into the big ugly hash ID.
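
    To make this concrete, here is a minimal sketch in a throwaway repository. The temporary directory, the Demo identity, and the empty commit are assumptions made purely for illustration. It shows HEAD and master resolving to the same hash ID while master is the current branch.

    ```shell
    set -e
    repo=$(mktemp -d)          # throwaway repository; the path is arbitrary
    cd "$repo"
    git init -q
    git symbolic-ref HEAD refs/heads/master   # call the first branch master, whatever the local default is
    git -c user.name=Demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "first commit"

    head_id=$(git rev-parse HEAD)       # hash ID of the current commit
    master_id=$(git rev-parse master)   # hash ID stored under the name master
    echo "HEAD:   $head_id"
    echo "master: $master_id"           # same hash ID, because master is the current branch
    ```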

    Now, this means that to create a new branch name, you tell Git: Make a new branch name. Here is the hash ID to store in it: _______. The way you tell this to Git is to use any of various commands:

    • git branch newname: this tells Git to create the new name using the hash ID found by resolving HEAD to a hash ID.

    • git branch newname hash-id: this tells Git to create the new name using the hash ID you type in. Hash IDs are hard to type in, so you'd probably use the mouse to cut and paste one. But you don't have to, because:

    • git branch newname any-other-name-that-works-with-rev-parse: this has Git run git rev-parse on the last name to find the hash ID, then create the branch so that it contains that hash ID.

    • git checkout -b name and git checkout -b name start-point: these are very similar to using git branch followed by running git checkout.
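
    The four forms above can be tried side by side. This is only a sketch: the branch names (from-head, from-hash, from-name, from-checkout), the temporary repository, and the Demo identity are all made up for the example.

    ```shell
    set -e
    repo=$(mktemp -d)
    cd "$repo"
    git init -q
    git symbolic-ref HEAD refs/heads/master
    commit() { git -c user.name=Demo -c user.email=demo@example.com commit -q --allow-empty -m "$1"; }
    commit "first"
    first=$(git rev-parse HEAD)     # remember the hash ID of the first commit
    commit "second"

    git branch from-head                        # stores the hash ID that HEAD resolves to
    git branch from-hash "$first"               # stores the hash ID given literally
    git branch from-name master~1               # any name that git rev-parse can resolve works
    git checkout -q -b from-checkout "$first"   # create the name, then switch to it

    git rev-parse from-head from-hash from-name from-checkout
    current=$(git symbolic-ref --short HEAD)    # name of the branch we are on now
    echo "now on: $current"
    ```

    Only the last command changes the current branch; the three git branch calls leave you on master.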

    But there's one more way to create a new branch name, and that's to run git checkout name-that-does-not-yet-exist.

    Normally, if you do something like git checkout supercalifragialistic, you just get an error: Git tries to turn that name into a hash ID (using an internal equivalent of git rev-parse) and this fails entirely and the whole thing just stops with an error. But git checkout has built into it a special trick.

    Now, besides branch names, Git supports something that I call remote-tracking names (Git calls them remote-tracking branch names but the word branch here is kind of misleading, so I think it's better to leave it out). These are pretty simple, really: your Git connects to some other Git, when you tell it to. You probably call that other Git origin, since that's the standard name. You will occasionally run git fetch origin or git pull origin master or some such: the name origin here is how your Git finds the URL to use to call up the other Git.

    That other Git, over at origin, has branch names. Your Git remembers their branch names, but since your names are yours, your Git remembers them under alternate names. These are the remote-tracking names. Your Git renames their master to your origin/master, renames their xyz to origin/xyz, and so on.

    In your question you talked about upstream/newbranch. The name upstream is the standard name for a second Git repository, that you add with git remote add. There's one name for each "other Git" you talk to, and the remote-tracking names have the remote name, followed by the other Git's branch name, with a slash between them. So you might end up with both origin/newbranch and upstream/newbranch, and this matters below.
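
    A rough way to see remote-tracking names appear is to build two stand-in "other Gits" on local disk and fetch from both. The directory names (theirs-origin, theirs-upstream, mine) and the make_repo helper are hypothetical and exist only for this sketch.

    ```shell
    set -e
    work=$(mktemp -d)
    commit() { git -c user.name=Demo -c user.email=demo@example.com commit -q --allow-empty -m "$1"; }
    make_repo() {   # make_repo <dir> [branch]: one commit on master, optionally one more on <branch>
        git init -q "$1"
        ( cd "$1"
          git symbolic-ref HEAD refs/heads/master
          commit "initial commit in $1"
          if [ -n "$2" ]; then
              git checkout -q -b "$2"
              commit "work on $2 in $1"
              git checkout -q master
          fi )
    }
    make_repo "$work/theirs-origin"                # has only master
    make_repo "$work/theirs-upstream" newbranch    # has master and newbranch

    # Our own repository talks to both, under the remote names origin and upstream.
    git init -q "$work/mine"
    cd "$work/mine"
    git remote add origin "$work/theirs-origin"
    git remote add upstream "$work/theirs-upstream"
    git fetch -q --all

    git branch -r    # lists the remote-tracking names, such as upstream/newbranch
    git branch       # lists our own branch names: none so far
    ```

    After the fetch, our Git remembers their branches as origin/master, upstream/master, and upstream/newbranch, but we still have no branch name of our own.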

    DWIM mode

    When you run a git checkout that would error out because you don't have the branch, git checkout will try a new trick before actually failing.

    Your Git will scan through all of your remote-tracking names. For instance, you might have origin/master, origin/xyz, upstream/xyz, and upstream/newbranch.

    If you already have a master and run git checkout master, well, you have a master, so that's the one git checkout will use. But if you run git checkout newbranch and don't have a newbranch, Git will scan all the above. Only upstream/newbranch "looks right", so Git will say to itself: Aha, if I automatically create newbranch from upstream/newbranch right now, I can switch to it! So that's what it does: create this as a new branch, and then switch to it. The assumption is that while you said switch to existing branch newbranch, you must have meant create new branch newbranch from upstream/newbranch. Git does what you meant, instead of what you said.
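
    Here is a sketch of that behavior using a fork-style setup: a clone of origin plus a second remote named upstream. The repository paths and the make_repo helper are invented for the example, and it assumes default configuration (in particular, that branch.autoSetupMerge has not been changed).

    ```shell
    set -e
    work=$(mktemp -d)
    commit() { git -c user.name=Demo -c user.email=demo@example.com commit -q --allow-empty -m "$1"; }
    make_repo() {   # make_repo <dir> [branch]: one commit on master, optionally one more on <branch>
        git init -q "$1"
        ( cd "$1"
          git symbolic-ref HEAD refs/heads/master
          commit "initial commit in $1"
          if [ -n "$2" ]; then
              git checkout -q -b "$2"
              commit "work on $2 in $1"
              git checkout -q master
          fi )
    }
    make_repo "$work/theirs-origin"                # the fork: only master
    make_repo "$work/theirs-upstream" newbranch    # the original project: master and newbranch

    git clone -q "$work/theirs-origin" "$work/mine"
    cd "$work/mine"
    git remote add upstream "$work/theirs-upstream"
    git fetch -q upstream

    # We have no branch called newbranch, and only upstream/newbranch "looks right".
    git checkout -q newbranch

    current=$(git symbolic-ref --short HEAD)                       # branch we ended up on
    tracked=$(git rev-parse --abbrev-ref "newbranch@{upstream}")   # its upstream setting
    echo "on $current, upstream is $tracked"
    ```

    With a single candidate like this, plain git checkout newbranch ends up in the same place as git checkout -b newbranch upstream/newbranch.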

    Note that if you run git checkout xyz, Git has a new problem: there are now two candidates from which to create xyz. It could be created from origin/xyz, or from upstream/xyz. By default, DWIM mode will just not create anything, and you'll see the error.

    (Git 2.21 and later have --no-guess to disable DWIM entirely. This is mainly useful with the bash completion scripts, if you don't want Git to guess all possible remote-tracking names.)
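
    The ambiguous case can be reproduced the same way. In this sketch both stand-in repositories have a branch named xyz; the paths and the make_repo helper are hypothetical, and it assumes no checkout.defaultRemote setting is configured.

    ```shell
    set -e
    work=$(mktemp -d)
    commit() { git -c user.name=Demo -c user.email=demo@example.com commit -q --allow-empty -m "$1"; }
    make_repo() {   # make_repo <dir> <branch>: one commit on master, one more on <branch>
        git init -q "$1"
        ( cd "$1"
          git symbolic-ref HEAD refs/heads/master
          commit "initial commit in $1"
          git checkout -q -b "$2"
          commit "work on $2 in $1"
          git checkout -q master )
    }
    make_repo "$work/theirs-origin" xyz
    make_repo "$work/theirs-upstream" xyz

    git clone -q "$work/theirs-origin" "$work/mine"
    cd "$work/mine"
    git remote add upstream "$work/theirs-upstream"
    git fetch -q upstream

    # Two candidates (origin/xyz and upstream/xyz), so DWIM mode should refuse to guess.
    if git checkout -q xyz 2>/dev/null; then
        dwim=created
    else
        dwim=refused
    fi
    echo "plain checkout: $dwim"

    # Naming the start point explicitly removes the ambiguity.
    git checkout -q -b xyz upstream/xyz
    git rev-parse xyz upstream/xyz    # both print the same hash ID
    ```

    This is the situation in which the longer git checkout -b form from the question is actually needed.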

    Several other important things to know

    When you create a new branch name, you can have Git set its upstream:

    • Every branch name has either one upstream, or no upstream.
    • Typically the upstream for master would be origin/master, for instance.
    • The upstream setting gives you more information from git status, and lets you run git fetch, git merge, git rebase, and git pull without specifying anything more. So it's meant to be convenient. If you find it convenient, use it; if not, don't.

    To set the upstream of a branch explicitly, use git branch --set-upstream-to; to remove the upstream, use git branch --unset-upstream. When git checkout uses DWIM mode to create a branch, it will normally set the upstream of that branch to the remote-tracking name it used when creating the branch. You can adjust this with git config; see its documentation.
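
    As a small illustration of setting and removing an upstream by hand, the sketch below uses a hypothetical local branch called feature in a clone of a throwaway repository; all paths and names are assumptions.

    ```shell
    set -e
    work=$(mktemp -d)
    git init -q "$work/theirs-origin"
    ( cd "$work/theirs-origin"
      git symbolic-ref HEAD refs/heads/master
      git -c user.name=Demo -c user.email=demo@example.com commit -q --allow-empty -m "initial" )

    git clone -q "$work/theirs-origin" "$work/mine"
    cd "$work/mine"

    git branch feature    # created from the local master, so it starts with no upstream
    before=$(git config --get branch.feature.merge || true)

    git branch --set-upstream-to=origin/master feature
    tracked=$(git rev-parse --abbrev-ref "feature@{upstream}")
    echo "upstream of feature: $tracked"

    git branch --unset-upstream feature
    after=$(git config --get branch.feature.merge || true)   # empty again
    ```

    The upstream is just configuration stored under branch.feature.remote and branch.feature.merge, which is why git config can read it back.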

    When using git branch or git checkout -b, you can explicitly tell Git whether to set the upstream of the newly-created branch, using the -t or --track option (these are the same option: one is just a longer spelling). Note that in the tricky case of having both origin/xyz and upstream/xyz, using:

    git checkout -t origin/xyz
    

    is a short-hand way of running:

    git checkout -b xyz --track origin/xyz
    

    That is, it:

    1. specifies the name to use to get the hash ID when creating xyz locally;
    2. specifies that the local name is xyz because the remote-tracking branch used is origin/xyz; and
    3. specifies that new local xyz should be set with origin/xyz as its upstream.

    Using git checkout -t upstream/xyz works similarly, except that your new xyz uses the commit ID found by resolving upstream/xyz and your new xyz has upstream/xyz as its upstream.
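
    The -t shorthand can be checked with the same two-remote setup. As before, this is a sketch under assumptions: the repository paths and the make_repo helper are invented for the example.

    ```shell
    set -e
    work=$(mktemp -d)
    commit() { git -c user.name=Demo -c user.email=demo@example.com commit -q --allow-empty -m "$1"; }
    make_repo() {   # make_repo <dir> <branch>: one commit on master, one more on <branch>
        git init -q "$1"
        ( cd "$1"
          git symbolic-ref HEAD refs/heads/master
          commit "initial commit in $1"
          git checkout -q -b "$2"
          commit "work on $2 in $1"
          git checkout -q master )
    }
    make_repo "$work/theirs-origin" xyz
    make_repo "$work/theirs-upstream" xyz

    git clone -q "$work/theirs-origin" "$work/mine"
    cd "$work/mine"
    git remote add upstream "$work/theirs-upstream"
    git fetch -q upstream

    # The local name xyz is derived from origin/xyz, which also becomes the upstream.
    git checkout -q -t origin/xyz

    current=$(git symbolic-ref --short HEAD)
    tracked=$(git rev-parse --abbrev-ref "xyz@{upstream}")
    echo "on $current, upstream is $tracked"
    ```

    Replacing origin/xyz with upstream/xyz in the checkout line gives a local xyz that starts at, and tracks, upstream/xyz instead.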