
Git does not have files for all branches in the refs dir, and the HEAD reference file does not hold a commit ID. Why?


From my understanding, GIT keeps track of branches using plain text files with name same as the branch name. For remote-tracking branches these files are stored in .git\refs\remotes\origin, and for local branches they are in .git\refs\heads.

Below is the output from git branch:

$ git branch -a
  joincolumn_issue
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/joincolumn_issue
  remotes/origin/mappedBy
  remotes/origin/master
  remotes/origin/todelete

First part of the problem/question:
As you can see, there are several remote branches git is aware of, but on looking at the .git dir I don't see all of them:

Samsh@Sambox MINGW64 /d/graphql-hibernate/.git/refs/remotes/origin (GIT_DIR!)
$ ls
HEAD  joincolumn_issue

Why are the files for the other branches not present? The branches other than joincolumn_issue have never been checked out from the remote, so perhaps that is the reason. But if that is the case, then how and from where does git obtain the details of the other branches? It lists them in git branch -a, and it is definitely not polling the remote repo to answer this query.

Part two of the problem/question: on looking at the contents of the files in the ref dir:

Samsh@Sambox MINGW64 /d/graphql-hibernate/.git/refs/remotes/origin (GIT_DIR!)
$ cat joincolumn_issue
1950d716308e5063f1b8f28c2423166781335333

This is as expected: it points to a commit ID. Fine. But the problem is with the output below.

$ cat HEAD
ref: refs/remotes/origin/master

HEAD refers to master, and there is no such file in the .git dir. So now you understand my problem: I cannot see how git is able to figure out the tip of master without knowing/tracking the related commit ID.


Solution

  • GIT keeps track of branches using plain text files with name same as the branch name.

    Sometimes, yes. Sometimes no. You're not supposed to care. Why are you trying to inspect the .git/refs/heads/ files at all? [1]

    Git has a database, [2] somewhere, of name-to-hash-ID mappings. You can extract hash IDs from names using the git rev-parse program: [3]

    git rev-parse refs/heads/master
    

    gives you the hash ID for refs/heads/master, i.e., branch-name master. You can iterate over some or all references with git for-each-ref (see its documentation).
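
To see that these commands do not depend on a per-branch file existing, here is a minimal sketch that builds a throwaway repository, packs its refs, and resolves the branch name again. The temporary directory, the demo identity, and the variable names are assumptions made for illustration; nothing here touches your real repository.

```shell
#!/bin/sh
# Sketch: a name resolves the same way whether it is a loose file or packed.
set -e
repo=$(mktemp -d)                      # throwaway repository (illustrative location)
cd "$repo"
git init -q .
git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
    commit -q --allow-empty -m "first"
branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is called here

before=$(git rev-parse "refs/heads/$branch")
git pack-refs --all                    # move the names into Git's packed storage

# There is no per-branch file any more, yet the name still resolves.
test ! -e ".git/refs/heads/$branch" && echo "no loose file for $branch"
after=$(git rev-parse "refs/heads/$branch")
test "$before" = "$after" && echo "still resolves: $after"

# List every name Git knows about, wherever it happens to be stored.
git for-each-ref --format='%(objectname) %(refname)'
```

In a clone, the remote-tracking names typically start out in this packed form, which would explain seeing only a few files under refs/remotes/origin.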

    In general, you set names to hash IDs using higher level interfaces, e.g., git branch or git tag; but you can write the database directly using git update-ref:

    git update-ref [flags] refs/heads/master <new-hash> [<old-hash>]
    

    means "store the new hash"; if I also provide the old hash, Git first makes sure the name currently maps to that old hash ID, and refuses to make the change if it does not.
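
The old-hash check can be tried out safely in a scratch repository. The following is a sketch under stated assumptions: the branch name demo, the helper mkcommit, and the variable names are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch: git update-ref only moves the name when the old hash matches.
set -e
repo=$(mktemp -d)                      # throwaway repository (illustrative location)
cd "$repo"
git init -q .
mkcommit() {                           # hypothetical helper: make an empty commit
    git -c user.name=demo -c user.email=demo@example.com -c commit.gpgsign=false \
        commit -q --allow-empty -m "$1"
}
mkcommit first;  first=$(git rev-parse HEAD)
mkcommit second; second=$(git rev-parse HEAD)

# Point the hypothetical branch "demo" at the second commit.
git update-ref refs/heads/demo "$second"

# Wrong old hash: Git refuses and leaves the name where it was.
if git update-ref refs/heads/demo "$first" "$first" 2>/dev/null; then
    echo "unexpected: update went through"
else
    echo "rejected: demo is not at $first"
fi
rejected=$(git rev-parse refs/heads/demo)

# Correct old hash: the name now maps to the first commit.
git update-ref refs/heads/demo "$first" "$second"
accepted=$(git rev-parse refs/heads/demo)
echo "demo now at $accepted"
```

The guarded form is useful in scripts, since it avoids overwriting a name that something else moved in the meantime.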


    [1] Debugging, learning, or just plain stubbornness, perhaps. :-)

    [2] Internally, in the C code, Git has the concept of "back ends" that implement reference store and load. The current "database" is really pretty crappy, and consists of either the flat file .git/packed-refs or a file at .git/refs/<path>. That is also why some of your remote-tracking names have no file of their own: a name with no file under .git/refs/ is normally listed in .git/packed-refs instead. The flat-file "database" works fine (it stores the names master and MASTER separately, for instance, so the two branches are different, as Git intended them to be), but the local-file-system file tree "database" fails badly on some systems where you cannot create files named both master and MASTER. It's pretty clear that large server providers like GitHub could use a real database: some repositories have thousands of tags and dealing with the flat file and the file tree is, shall we say, "not good".

    [3] Rev-parse does a lot more than just look up a name in the database, but it does do that, so why not use it?