
What is the use of git bare repositories?


On the remote machine I create a bare repository with the following command:

git init --bare $HOME/bare/My-Repo

On my local machine I clone a repo (but not the one that I have created on the remote machine):

git clone ssh://something.com/My-Repo

On my local machine, in the repository that I have just cloned, I create a "reference" to the above mentioned bare repository located on the remote machine:

git remote add my_remote ssh://remote.com/${USER}/bare/My-Repo

I do not know if it is relevant to my question but I also do the following stuff in my local repository:

git config remote.my_remote.receivepack /some/path/git-receive-pack
git config remote.my_remote.uploadpack /some/path/git-upload-pack

Now, I can push from the local repository to the bare repository on the remote machine:

git push my_remote master

Now, the idea is that we create another repository on the remote machine which should take some content from the bare repository. I create this repo by executing something like this on the remote machine:

some_script.sh $HOME/bare/My-Repo $HOME/My-Repo

Now, I make some changes in $HOME/My-Repo on the remote machine (please note that it is not the bare repository), then I git add and git commit, and after that I push:

git push origin my-dev-branch

As a result I push to the bare repository (which is located on the same remote machine). After that I go to my local machine and "take" the changes from the remote machine (from the bare repository):

git fetch my_remote
git checkout -b my-dev-branch my_remote/my-dev-branch
git fetch origin
git rebase origin/master

So, my questions are: Why do we need this bare repository? Why can't we just have one remote repository and directly exchange content between the local and remote repositories? Or, alternatively, why can't we just rsync (or scp) the local repository to and from the remote machine?

ADDED

This is why I ask the question instead of googling. I have googled my question and the first link is here: http://www.saintsjd.com/2011/01/what-is-a-bare-git-repository/

I read there:

Repositories created with the git init command are called working directories. In the top level folder of the repository you will find two things:

  • A .git subfolder with all the git related revision history of your repo
  • A working tree, or checked out copies of your project files.

Repositories created with git init --bare are called bare repos. They are structured a bit differently from working directories. First off, they contain no working or checked out copy of your source files. And second, bare repos store git revision history of your repo in the root folder of your repository instead of in a .git subfolder. Note… bare repositories are customarily given a .git extension.

So, now to understand that I need to know:

  1. What is "revision history"?
  2. What is "working tree"?
  3. What is "checked out copies"?

And this is just the first section. I know that the response will be: go and read the git tutorial or an introduction to git. Well, I did. The first problem is that the use of terms is "circular": term A is explained through term B, and term B is explained through term A. Second, I do not have time to read a book just to find an answer to my question. Third, I am 100% sure that the answer to my question can be expressed in simple terms. It is actually trivial.


Solution

  • If I understand correctly, your main question boils down to this: why do we need an intermediary bare repository to sync between two Git repositories?

    In your concrete example you have a local Git repository, let's call it X, and two remote repositories: a bare one, let's call it B, and a non-bare one, let's call it Y.

    At this point I hope you see that X and Y could be anywhere. They could both be local, or they could both be remote, on the same server as B. It doesn't matter where they are, what matters is that you want to sync between them, and for that you need B, somewhere, anywhere. And your question is why?

    First of all, you don't really need B in the middle. It's recommended, and the default behaviors will guide you in this direction.
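    To make the X, B, Y arrangement concrete, here is a minimal sketch that builds all three repositories on one machine in a throwaway directory. The directory names (B.git, X, Y), the commit message, and the placeholder identity are assumptions made for illustration; only the roles come from the question.

```shell
set -e
tmp=$(mktemp -d)          # scratch directory, nothing outside it is touched
cd "$tmp"

git init --quiet --bare B.git      # B: the bare repository in the middle
git clone --quiet B.git X          # X: first working repository (Git warns that B is empty)
git clone --quiet B.git Y          # Y: second working repository

# Make a commit in X (placeholder identity) and publish it to B as "master".
git -C X -c user.name=Example -c user.email=example@example.com \
    commit --quiet --allow-empty -m "change made in X"
git -C X push --quiet origin HEAD:refs/heads/master

# Y never talks to X directly; it picks the commit up from B.
git -C Y fetch --quiet origin
subject_in_y=$(git -C Y log -1 --format=%s origin/master)
echo "$subject_in_y"
```

    X and Y only ever exchange commits through B, which is the same shape as the local machine, bare repository, and $HOME/My-Repo in the question.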

    Why is it recommended to have a bare repo between repositories? The answer has to do with working trees. The working tree is where you have the files you work on. The Git repository of a working tree is inside the .git subdirectory. The Git commands you execute inside the working tree communicate with the Git repository, query its content, and make comparisons with the content you have in the working tree. A bare repository is like the .git directory alone, without a working tree.

    To demystify bare repositories, try this to see the difference between a bare and non-bare repo:

    $ cd /tmp/
    $ git init --bare bare.git
    Initialized empty Git repository in /private/tmp/bare.git/
    $ git init regular
    Initialized empty Git repository in /private/tmp/regular/.git/
    $ diff -r bare.git/ regular/.git/
    diff -r bare.git/config regular/.git/config
    4c4,5
    <   bare = true
    ---
    >   bare = false
    >   logallrefupdates = true
    

    Technically, the difference is a matter of a few configuration flags. I could move the bare.git directory into any directory in the filesystem, rename it to .git, edit its config file, and voilà, I will have converted a bare repository into a regular one.
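    The conversion just described can be sketched as follows. This is an illustration in a scratch directory, assuming a freshly created bare repository; the directory name converted is made up for the example, and the config edit is done with git config rather than a text editor.

```shell
set -e
tmp=$(mktemp -d)          # scratch directory
cd "$tmp"

git init --quiet --bare bare.git

# Move the bare repository so that it becomes the .git directory of a new folder.
mkdir converted
mv bare.git converted/.git

# Flip the flags that differed in the diff between the two config files.
git --git-dir=converted/.git config core.bare false
git --git-dir=converted/.git config core.logallrefupdates true

# Ask Git how it now sees the repository.
is_bare=$(git -C converted rev-parse --is-bare-repository)
echo "$is_bare"
```

    With a repository that already has commits, the working tree would start out empty, so you would still need to check out a branch afterwards to get the files.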

    Back to the question: why is it recommended to have a bare repository between two non-bare repositories that want to sync with each other? By default, Git refuses a push that would update the currently checked-out branch of a non-bare repository (the receive.denyCurrentBranch setting on the receiving side controls this). What if it didn't? If git push updated the content of the remote's .git directory, what should happen to the corresponding working tree? A lot of questions would follow:

    • What is the current branch of the working tree? If the push didn't change that branch, then everything is fine.
    • What should happen if the push affected the current branch of the working tree? Should Git update the working tree?
      • If the working tree is not clean, it cannot be safely updated. Uncommitted changes can get lost.
      • Even if the working tree is clean, there can be conflicts if there are changes to .gitignore, or if the history of the branch was rewritten with a force push.
      • If we don't update the working tree, it will be out of sync with the repository, so git status would report modifications that should not be committed.

    Does this sound complicated and confusing? That's because it is. And it's the reason why pushing to the checked-out branch of a non-bare repository is refused by default. You can allow it if you really want to, and it can be reasonably safe if you consistently keep the working tree of the receiving repo clean, but it is error-prone (you might forget), and nobody needs such uncertainty and mental burden. It's really best to just use a bare repository in between.
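    To see this behavior for yourself, the sketch below pushes the same commit first to a non-bare repository and then to a bare one. It assumes Git's default settings (in particular, that receive.denyCurrentBranch has not been changed in your configuration); the repository names target, source, and hub.git and the placeholder identity are invented for the example.

```shell
set -e
tmp=$(mktemp -d)          # scratch directory
cd "$tmp"
id="-c user.name=Example -c user.email=example@example.com"   # placeholder identity

# A non-bare repository with one commit on its checked-out branch.
git init --quiet target
git -C target $id commit --quiet --allow-empty -m "initial"

# A clone that adds one more commit on the same branch.
git clone --quiet target source
git -C source $id commit --quiet --allow-empty -m "change"

# Push to the non-bare repository: this targets its checked-out branch.
if git -C source push --quiet origin HEAD 2>/dev/null; then
    nonbare_push=accepted
else
    nonbare_push=refused
fi

# Push the same commit to a bare copy of the repository instead.
git clone --quiet --bare target hub.git
if git -C source push --quiet "$tmp/hub.git" HEAD 2>/dev/null; then
    bare_push=accepted
else
    bare_push=refused
fi

echo "non-bare: $nonbare_push, bare: $bare_push"
```

    The bare repository has no working tree that could get out of sync, so none of the questions above arise for it.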