git, github, github-for-windows

Is upstream a repo or a branch?


I have a question about upstream, and about whether I'm following the right process to keep my fork updated with the original repo. I should do something like

git remote add upstream 'link'

and then

git fetch upstream 

to update upstream/master.

That means it is a tracking repo. Can it also be created as an upstream tracking branch, so that I can then switch to it like

git checkout branchname

What's the difference between the two approaches?


Solution

  • I think you're confusing remote repositories with remote tracking branches.

    What's a remote repository?

    Given a repository R, a remote repository is a clone of R that's physically separated from it, usually by a network.

    If you want to keep track of what happens in a remote repository, you add a reference to it from R. This reference is called a remote and it's usually named origin, by convention.

    From the Git glossary:

    Most projects have at least one upstream project which they track. By default origin is used for that purpose.
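
To make the idea of a remote concrete, here is a minimal sketch that runs entirely in a temporary directory. The directory names `remote.git` and `R` are made up for illustration, and a local bare repository stands in for a repository that would normally live on another machine.

```shell
set -e
# Work in a throwaway directory so nothing touches a real project.
work=$(mktemp -d)
cd "$work"

# A bare repository stands in for the remote repository (hypothetical path).
git init --quiet --bare remote.git

# R is the local repository; "origin" is just a name stored in R's config.
git init --quiet R
cd R
git remote add origin "$work/remote.git"

# List the remotes R knows about, together with their URLs.
git remote -v
```

Nothing is downloaded at this point: adding a remote only records a name and a URL, and the actual history arrives later with `git fetch`.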

    What does "upstream" mean?

    As far as Git is concerned, all repositories are created equal. In almost all projects, however, there is a hierarchy of repositories, and at the top is the one that everybody agrees is the canonical repository.

    Here's an example:

          +-------+
          |       |
          |  Git  |         <-- upstream
          |       |
          +---+---+
              ^
              |
    +---------+---------+
    |                   |
    |  Git for Windows  |   <-- origin
    |                   |
    +---------+---------+
              ^
              |
       +------+------+
       |             |
       |  Your Fork  |
       |             |
       +-------------+
    

    Seen from the perspective of Your Fork, if you want to track what happens in Git for Windows, you would call that remote repository origin because that's the one you cloned.

    If you also want to keep track of what happens in the canonical Git repository, you would name that remote reference upstream because it's further up the hierarchy of repositories where changes stream downwards.
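
The hierarchy in the diagram could be set up roughly as follows. This is a sketch under assumptions: two empty local bare repositories (`canonical.git` and `intermediate.git`, both hypothetical names) stand in for Git and Git for Windows, so Git prints a harmless warning about cloning an empty repository.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Stand-ins for the two repositories further up the hierarchy.
git init --quiet --bare canonical.git      # plays the role of "Git"
git init --quiet --bare intermediate.git   # plays the role of "Git for Windows"

# Cloning names the source repository "origin" automatically.
git clone --quiet intermediate.git fork
cd fork

# The canonical repository is added by hand, under the conventional name "upstream".
git remote add upstream "$work/canonical.git"

git remote -v
```

Both names are conventions only; Git would accept any other remote name in their place.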

    The Git glossary summarizes the distinction between origin and upstream well:

    Most projects have at least one upstream project which they track. By default origin is used for that purpose. New upstream updates will be fetched into remote-tracking branches named origin/name-of-upstream-branch.

    This brings us to our next question.

    What's a remote tracking branch?

    A remote tracking branch is a reference in your local repository that records the state of a branch in a remote repository that you want to track.

    Again, from the Git glossary:

    A ref that is used to follow changes from another repository.

    These branches are usually named:

    <remote-name>/<branch-name>
    

    to allow you to differentiate them from your local branches, that is, branches that exist only in your local repository.

    For example:

    upstream/master  <-- master branch in the upstream repo
    origin/master    <-- master branch in the origin repo
    master           <-- local master branch
    

    Keep in mind that remote tracking branches are read-only — you can update them with whatever new commits happened in the remote branch by running git fetch, but you can't commit to them. In that sense, you can think of them more as bookmarks.
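
The following sketch shows `git fetch` creating a remote tracking branch. It assumes a throwaway setup: `upstream-repo` and `local` are hypothetical directory names, and the user name and email passed to `git commit` are placeholders.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A stand-in for the upstream repository, with one commit on master.
git init --quiet upstream-repo
git -C upstream-repo symbolic-ref HEAD refs/heads/master
git -C upstream-repo -c user.name=Example -c user.email=example@example.com \
    commit --quiet --allow-empty -m "first upstream commit"

# A separate local repository that knows the other one as "upstream".
git init --quiet local
cd local
git remote add upstream "$work/upstream-repo"

# Before fetching, there are no remote tracking branches to list.
git branch -r

# git fetch creates or updates upstream/master; it does not touch local branches.
git fetch --quiet upstream
git branch -r
```

After the fetch, `upstream/master` points at the same commit as `master` in the upstream repository, and it only moves again the next time you fetch.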

    What's an upstream branch?

    When you track a remote repository from a local one, chances are you'll be working on the same branches as your fellow contributors.

    However, we know that remote tracking branches are just bookmarks, so we can't commit to them directly.

    The solution is to create local branches (which we can commit to) and associate them with the remote branches. This allows us to quickly bring changes from the remote branch into our local one with git pull, and to send our local changes the other way with git push.

    In this case, the Git glossary is a bit more mysterious with its definition:

    upstream branch
    The default branch that is merged into the branch in question.

    In other words, an upstream branch is the remote counterpart of a local branch; this relationship exists solely as a convenient way of keeping those branches in sync without having to explicitly reference them by name.
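
As an illustration, the sketch below creates a local branch whose upstream branch is `upstream/master` and then asks Git to report that relationship. The repository layout, the branch name `my-master`, and the commit identity are all assumptions made for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# A stand-in for the upstream repository, with one commit on master.
git init --quiet upstream-repo
git -C upstream-repo symbolic-ref HEAD refs/heads/master
git -C upstream-repo -c user.name=Example -c user.email=example@example.com \
    commit --quiet --allow-empty -m "first upstream commit"

# A local repository that has fetched upstream/master.
git init --quiet local
cd local
git remote add upstream "$work/upstream-repo"
git fetch --quiet upstream

# Create a local branch at upstream/master and record it as the upstream branch.
git checkout --quiet -b my-master --track upstream/master

# Ask Git which upstream branch is configured for the current branch.
git rev-parse --abbrev-ref '@{upstream}'

# The relationship is stored as two entries in the repository config.
git config branch.my-master.remote
git config branch.my-master.merge
```

With this configuration in place, a plain `git pull` on `my-master` knows which remote branch to fetch and merge without being told its name. For a branch that already exists, `git branch --set-upstream-to=upstream/master` records the same relationship.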