Why git checkout after git clone?


I am new to Git. I understand the basics and the usual development workflow, but one thing confuses me.

Whenever I have to pull something from a Git repo that contains multiple projects (example), I see these steps:

  1. git clone xxx
  2. cd xxx
  3. git checkout yyy

This is a bit puzzling to me. Since I already have the entire repo, why would I want to check out the project I am interested in when I can just copy the folder and do whatever I like with it?


Solution

  • Note: I started this as a comment, but it grew long enough, and benefits enough from formatting, that I moved it to an answer. It is a reply to your comment to Tim Beigeleisen:

    If you refer to my example link, checking out "stereo_image_proc" works but I can't see it being one of the branches, how come?

    The reason git checkout stereo_image_proc does not complain (and seems not to do anything, at first blush) is that git checkout itself is really two different commands combined into one. This is a feature, or misfeature in some people's opinion (including mine): git checkout's argument can be a branch name, or a path name.

    Specifically:

    • git checkout branch asks Git to switch to, or sometimes even create and then switch to, a branch whose name you supply on the command line.

      Switching to a branch is a surprisingly complicated process, in the end, but it starts out simple enough: it changes Git's notion of HEAD so that you're on the named branch. It has another very useful feature: before Git actually switches to (and/or creates) this branch, Git makes sure that this won't clobber any work you accidentally started on the wrong branch.

    • git checkout name1 name2 ... nameN, on the other hand, asks Git to extract particular files from some named or implied commit. This is often best written out as git checkout -- file, where the -- tells Git that the name should not be treated as a branch name. That is, suppose you have a file named master and you want to extract it. Running git checkout master would not do that, because Git resolves master to the branch named master, not the file named master. So git checkout -- master tells Git: not the branch, the file.

      When you use this kind of git checkout, you're telling Git: I know I started editing some file or files, but I have now decided that editing this file, or all of these files, was a mistake. Put them all back the way they were, replacing each one with its saved version (the copy in Git's index by default, or the copy from a commit you name). For instance, suppose you have a file named README.txt and you started editing it and then realized that you should be creating a new documentation file. You copy the new stuff you added to the new file, but now you want README.txt to go back to the way it was before you started editing it. So you run git checkout README.txt, and that wipes out your changes to the file.

      But as far as Git is concerned, naming a directory here (or a folder if you prefer that term) means every file in the directory, including any sub-directories recursively. Since stereo_image_proc is a directory, and is not a branch name, you are getting this second form of git checkout.
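The -- disambiguation from the list above can be tried out safely in a scratch repository. This is only a sketch under stated assumptions: the temporary directory, the user name and e-mail, and the file contents are all made up for illustration, and the branch is deliberately named master so that it collides with the file name.

```shell
#!/bin/sh
# Sketch in a throwaway repository; every name here is made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

# Name the (still unborn) branch master, then commit a FILE also called master.
git symbolic-ref HEAD refs/heads/master
echo "original" > master
git add master
git commit -q -m "add a file named master"

# Simulate an edit we later regret.
echo "unwanted edit" >> master

# Branch form: "master" resolves to the branch, so the file is left alone.
git checkout -q master
after_branch_form=$(cat master)

# Path form: "--" says what follows is a path, so the file is restored.
git checkout -q -- master
after_path_form=$(cat master)

printf 'after branch form:\n%s\n' "$after_branch_form"
printf 'after path form:\n%s\n' "$after_path_form"
```

After the branch form the unwanted line is still in the file. Only the form with -- puts the committed content back.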

    The bottom line is that git checkout stereo_image_proc tells Git to wipe out any changes you made to any files within that directory. If you have not made any changes, well, no problem! But if you have, this can be pretty disastrous.
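To see the directory case for yourself without risking real work, something like the following should do. It is a minimal sketch in a temporary repository: the directory name is borrowed from the question, while notes.txt, the identity settings, and the file contents are hypothetical.

```shell
#!/bin/sh
# Sketch in a throwaway repository; file names and contents are made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

# Commit one file inside a directory whose name is NOT a branch name.
mkdir stereo_image_proc
echo "original" > stereo_image_proc/notes.txt
git add stereo_image_proc
git commit -q -m "initial commit"

# Simulate uncommitted work inside that directory.
echo "my unsaved edit" >> stereo_image_proc/notes.txt

# No branch has this name, so Git treats the argument as a path and
# restores every file under it, silently discarding the edit.
git checkout stereo_image_proc

cat stereo_image_proc/notes.txt   # prints: original
```

Running git status or git diff on the directory first shows what is about to be thrown away.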

    Since git checkout has these two modes (the safe "switch branches" mode and the unsafe "clobber all my work" mode), you have to take note of which one you are invoking every time you run git checkout.
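For contrast, here is a sketch of the safe mode at work. It assumes a scratch repository in which a hypothetical branch named feature has committed a different README.txt. The starting branch name is read from Git rather than assumed, because the default differs between installations.

```shell
#!/bin/sh
# Sketch in a throwaway repository; branch and file names are made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

echo "version 1" > README.txt
git add README.txt
git commit -q -m "initial commit"
start=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

# A second branch on which README.txt has different committed content.
git checkout -q -b feature
echo "version 2" > README.txt
git commit -q -a -m "change README on feature"
git checkout -q "$start"

# Uncommitted edit on the starting branch.
echo "work in progress" > README.txt

# Branch form: switching would overwrite the edit, so Git refuses.
if git checkout feature 2>/dev/null; then
    echo "switched"
else
    echo "refused"
fi

cat README.txt   # prints: work in progress
```

The script prints refused and the edit survives. Git 2.23 and later also provide git switch and git restore, which split the two modes into separate commands, so the command name itself tells you which one you are invoking.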