
How do you add missing branches to a clone of a cloned repository?


git clone and git clone --mirror are quite different:

A standard git clone may be used as a workspace. The set of branches known to the origin is available to be checked out and worked on.

A mirror is more like a backup. You cannot use it directly as a workspace.

Now suppose you clone a repository that has already been cloned as per this question: Git cloning a repository that is already a clone

The resulting repository (clone2) has only the branches that have been used in the first clone (clone1). However, clone1 still has knowledge of the branches at the origin. Is there a way to add a branch known to clone1 to clone2 without setting the (original) origin as a remote?

In case that isn't clear we have:

repo1:

  • branch1
  • branch2
  • branch3
  • branch4

clone1 = git clone repo1.git:

  • branch1 - checked out
  • branch2 - previously checked out
  • branch3 - not checked out but can be at any time
  • branch4 - not checked out but can be at any time

clone2 = git clone clone1.git:

  • branch1 - checked out
  • branch2 - not checked out but can be at any time

clone2 appears to have no knowledge of branch3 or branch4 so cannot check them out. How do we get that information from clone1?

Actually there are two questions here:

  • how do we get information about just one branch from clone1 to clone2?
  • how do we get information about all possible branches from clone1 to clone2?

I believe that branch3 and branch4 are available to clone1 while repo1 is offline.

There are several use cases for this:

  • repo1 is unavailable and we wish to recreate it (bonus points if you know a way to recreate repo1 from clone1)
  • repo1 is temporarily unavailable and we wish to work on a different branch getting that information from someone who has already cloned repo1.
  • testing some changes that involve repo hacking before pushing them to a shared repo (repo1).

I believe it should make no difference but I am testing this using local files rather than URLs. So clone2 is actually made via git clone /local/path/.git

Update

There is a complication I failed to notice initially and report: git branch -r on clone2 should list the branches on clone1 as origin/branch3 and origin/branch4 as suggested in answers. However, for this particular repo it doesn't. I don't know why.

Things that might be special about this repo include:

  • use of .git/refs/replace

Pulling the replacements with git pull origin 'refs/replace/*:refs/replace/*' makes no difference.

Any other suggestions?

I have identified what is probably the significant difference between repos for which git branch -r works and the one for which it doesn't.

clone1 and clone2 should both list the remotes in .git/packed-refs with lines like (/path/to/clone1/.git/packed-refs):

2c3c761fbac82556c2178cb28a4e728360093e67 refs/remotes/origin/branch1

For some reason clone2 on the affected repository does not have all the entries in .git/packed-refs that it should.

I checked, and some of the commit IDs to which the packed-refs file in clone2 refers exist in the packed-refs file of the repository it was cloned from (clone1), but some don't. We seem to have both lost and gained branches!

If I experimentally copy the packed-refs file from clone1 to clone2, the branches appear under git branch -r. They can be checked out, but doing so results in a detached HEAD state.

Here is the git config for the 'affected' repo.

>cat .git/config
[core]
    repositoryformatversion = 0
    filemode = true
    bare = false
    logallrefupdates = true
[remote "origin"]
    url = /path/to/clone1/.git
    fetch = +refs/heads/*:refs/remotes/origin/*
[branch "develop"]
    remote = origin
    merge = refs/heads/develop

The standard instructions (https://git.wiki.kernel.org/index.php) for grabbing all remote-tracked branches work even for the broken repo:

git clone --mirror original-repo.git /path/cloned-directory/.git          
cd /path/cloned-directory
git config --bool core.bare false
git checkout anybranch

so there are several workarounds even for a broken repo.


Solution

  • It seems to me that you're looking at the very different typical use cases for clones with --mirror and those without, and letting that lead you to think they are fundamentally different. Actually they're just commonly-used special cases of a more general thing.

    That's mostly an aside, but I think if you study git concepts with an eye on really understanding the above statement, then the rest of this may be more clear as well.

    So: in clone1, the "knowledge" of the other branches is in the form of remote-tracking refs (refs/remotes/origin/branch3, ...). The branches that have been checked out additionally have "local" branch refs (refs/heads/branch1, ...). The default refspec of a "regular" clone fetches only refs/heads/* and maps them to refs/remotes/origin/*, which is why clone2 sees only clone1's local branches. (A clone made with --mirror instead uses the refspec +refs/*:refs/*, so every ref, including refs/remotes/origin/*, is copied under its original name.)
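    To make the ref layout concrete, here is a minimal sketch that rebuilds the scenario from the question in a scratch directory and lists the refs each clone holds. The temporary directory, the demo commit identity, and the single empty commit are assumptions made for illustration only; the repository and branch names follow the question.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)          # scratch directory; every path below is hypothetical
cd "$work"

# repo1: one empty commit with branch1..branch4 all pointing at it
git init -q repo1
git -C repo1 symbolic-ref HEAD refs/heads/branch1
git -C repo1 -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"
for b in branch2 branch3 branch4; do git -C repo1 branch "$b"; done

# clone1 has used branch1 and branch2; clone2 is a clone of clone1
git clone -q repo1 clone1
git -C clone1 checkout -q branch2
git -C clone1 checkout -q branch1
git clone -q clone1 clone2

# clone1 holds local refs for the branches it used plus remote-tracking refs for all four
echo "clone1 refs:"; git -C clone1 for-each-ref --format='  %(refname)'
# clone2 only received clone1's refs/heads/*, stored under refs/remotes/origin/*
echo "clone2 refs:"; git -C clone2 for-each-ref --format='  %(refname)'
echo "clone2 refspec: $(git -C clone2 config --get-all remote.origin.fetch)"
```

    Comparing the two listings shows that branch3 and branch4 exist in clone1 only under refs/remotes/origin/, which the default refspec in clone2 never asks for.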

    You could set the refspec in clone2 - either through settings, or in the arguments to a specific fetch or pull - to read the refs/remotes/origin/* refs from clone1. But there are some issues to think about.

    First, if you're going to map both the local and remote refs from clone1, then you need to give them different namespaces in clone2. That is, refs/heads/master in clone1 is distinct from refs/remotes/origin/master in clone1 and they might refer to different commits at any given time; so they can't both map to the same name in clone2.
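    As a sketch of the refspec approach, the script below fetches clone1's remote-tracking refs into clone2, first for a single branch and then for all of them. The destination namespace refs/remotes/clone1-origin/ is a name chosen here for illustration (any unused namespace would do), and the scratch setup with its demo identity is likewise an assumption.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)          # scratch directory; every path below is hypothetical
cd "$work"

# Same scenario as in the question: repo1 -> clone1 -> clone2
git init -q repo1
git -C repo1 symbolic-ref HEAD refs/heads/branch1
git -C repo1 -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"
for b in branch2 branch3 branch4; do git -C repo1 branch "$b"; done
git clone -q repo1 clone1
git -C clone1 checkout -q branch2
git -C clone1 checkout -q branch1
git clone -q clone1 clone2

# One branch, one time: give source and destination explicitly on the command line.
git -C clone2 fetch -q origin \
    'refs/remotes/origin/branch3:refs/remotes/clone1-origin/branch3'

# All branches, on every fetch: add a second refspec to the remote's configuration.
# The separate clone1-origin namespace keeps these apart from clone1's own local branches.
git -C clone2 config --add remote.origin.fetch \
    '+refs/remotes/origin/*:refs/remotes/clone1-origin/*'
git -C clone2 fetch -q origin
git -C clone2 branch -r

# Create a real local branch from the fetched ref instead of a detached HEAD.
git -C clone2 branch --no-track branch3 clone1-origin/branch3
```

    Nothing here contacts repo1: clone2 only talks to clone1, so this should also work while repo1 is offline, with the caveat that the fetched refs are only as fresh as clone1's last fetch.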

    Second, clone2's knowledge of - for example - branch3 is rather indirect at this point: "The last time I spoke to clone1, it told me that the last time it spoke to repo1, branch3 was at commit XYZ." It probably makes more sense to get knowledge of branch3 "from the horse's mouth". You'd do that by adding repo1 as a second remote on clone2.
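    A minimal sketch of the second-remote option follows. The remote name repo1 and the scratch paths are assumptions; in a real setup the URL would be wherever repo1 actually lives.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)          # scratch directory; every path below is hypothetical
cd "$work"

# Same scenario as in the question: repo1 -> clone1 -> clone2
git init -q repo1
git -C repo1 symbolic-ref HEAD refs/heads/branch1
git -C repo1 -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"
for b in branch2 branch3 branch4; do git -C repo1 branch "$b"; done
git clone -q repo1 clone1
git -C clone1 checkout -q branch2
git -C clone1 checkout -q branch1
git clone -q clone1 clone2

# Register repo1 alongside the existing origin (clone1) and fetch from it directly.
git -C clone2 remote add repo1 "$work/repo1"
git -C clone2 fetch -q repo1

# Remote-tracking refs now exist under both origin/ and repo1/.
git -C clone2 remote -v
git -C clone2 branch -r
```

    This gives clone2 first-hand refs such as repo1/branch3, but it does require repo1 to be reachable at fetch time.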

    Whether you go about it by adding repo1 as a remote, or by using non-default refspecs to copy the information from clone1, ultimately in clone2 you'll have multiple remote refs corresponding to some branch names (e.g. refs/remotes/origin/branch3 and refs/remotes/repo1/branch3). This means it may not always be clear which branch should be treated as "upstream" of the local refs/heads/branch3. You manage this through configuration, and/or through arguments to specific push and fetch commands telling them what you intend as the upstream in that instance.
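    The upstream choice can be recorded per branch. The sketch below assumes the second-remote setup (a remote named repo1, which is an illustrative name) and shows one way to set the upstream when creating a branch and one way to change it afterwards; the scratch paths and demo identity are assumptions.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)          # scratch directory; every path below is hypothetical
cd "$work"

# Same scenario as in the question, plus repo1 added as a second remote on clone2
git init -q repo1
git -C repo1 symbolic-ref HEAD refs/heads/branch1
git -C repo1 -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"
for b in branch2 branch3 branch4; do git -C repo1 branch "$b"; done
git clone -q repo1 clone1
git -C clone1 checkout -q branch2
git -C clone1 checkout -q branch1
git clone -q clone1 clone2
git -C clone2 remote add repo1 "$work/repo1"
git -C clone2 fetch -q repo1

# New local branch3 with repo1/branch3 recorded as its upstream.
git -C clone2 branch --track branch3 repo1/branch3

# Re-point the existing local branch1 from origin (clone1) to repo1.
git -C clone2 branch --set-upstream-to=repo1/branch1 branch1

# Show each local branch next to its configured upstream.
git -C clone2 for-each-ref --format='%(refname:short) -> %(upstream:short)' refs/heads
```

    For a one-off operation the configuration can be bypassed by naming the remote and branch on the command line, for example git fetch repo1 branch3.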

    Translating all of that into specific commands really depends on what you're trying to accomplish; there are just too many possibilities to list them all out and explain when you'd use any given one of them. If you need that level of detail, I'd suggest that the documentation for git config, git fetch, git push, and maybe git pull would be the places to start.