
How do I get `git clone --recursive` to recreate submodules' remotes and branches?


I have a project with a handful of submodules. Many of them are cloned from a GitHub fork to which I've added a branch for my custom mods. A typical setup looks like this:

In local folder: MyProject1/Frameworks/SomeAmazingRepo/

$ git branch -vva
*my-fork                       123456 [my-fork/my-fork] Latest commit msg from fork
master                         abcdef [origin/master] Latest commit msg from original repo
remotes/my-fork/my-fork        123456 [my-fork/my-fork] Latest commit msg from fork
remotes/my-fork/master         abcdef [origin/master] Latest commit msg from original repo
remotes/origin/HEAD            -> origin/master
remotes/origin/master          abcdef [origin/master] Latest commit msg from original repo

$ git remote -v
my-fork                        git@github.com:MyUser/SomeAmazingRepo.git (fetch)
my-fork                        git@github.com:MyUser/SomeAmazingRepo.git (push)
origin                         git://github.com/OriginalOwner/SomeAmazingRepo.git (fetch)
origin                         git://github.com/OriginalOwner/SomeAmazingRepo.git (push)

When I run git clone --recursive on my project to begin a new spin-off project, it starts recursing into the submodules and then spits out an error claiming it can't find the stored commits for these repos. Upon inspection, it seems that the remotes haven't been added and the submodule is left (empty) on master:

In local folder: MyProject2/Frameworks/SomeAmazingRepo/

$ git branch -vva
*master                        abcdef [origin/master] Latest commit msg from original repo
remotes/origin/HEAD            -> origin/master
remotes/origin/master          abcdef [origin/master] Latest commit msg from original repo

$ git remote -v
origin                         git://github.com/OriginalOwner/SomeAmazingRepo.git (fetch)
origin                         git://github.com/OriginalOwner/SomeAmazingRepo.git (push)

The only remedy is to go and add the remotes manually to all the repos (very tedious).

A similar issue exists when there are two tracking branches as above but only one remote (origin => my GitHub fork). In this case, it finds the commit and checks it out but fails to recreate the tracking branch, leaving a "dangling" commit. That is very scary, as it doesn't warn you!

How do I clone my project so that it reliably recreates the submodules' remotes and branches?


Solution

  • git clone --recursive is equivalent to a plain git clone followed by git submodule update --init --recursive.

    And git submodule update only checks out the SHA1 recorded in the parent repo:

    Update the registered submodules, i.e. clone missing submodules and checkout the commit specified in the index of the containing repository.
    This will make the submodules HEAD be detached.

    2012: So finding no active branch in a submodule is the norm.
    A git submodule foreach 'git checkout master' can at least set the master branch (if you are sure that all the recorded SHA1s were supposed to be part of a 'master' branch in each submodule).


    2013-2014: you can configure your .gitmodules file in order to specify a branch to checkout in your submodule.
    See "How do I update my git submodules from specific branches?"

    cd /path/to/your/parent/repo
    git config -f .gitmodules submodule.<path>.branch <branch>
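
As a concrete sketch of the command above, the helper below records a tracking branch for one submodule. The function name set_submodule_branch is made up for illustration, and it assumes the submodule's name in .gitmodules is the same as its path (the default when the submodule was added with git submodule add).

```shell
#!/bin/sh
# Hypothetical helper: record which branch a submodule should track.
# Assumption: the submodule's name in .gitmodules equals its path.
set_submodule_branch() {
    sm_name=$1
    branch=$2
    # Writes submodule.<name>.branch into the .gitmodules file
    # found in the current directory (run this from the parent repo root).
    git config -f .gitmodules "submodule.$sm_name.branch" "$branch"
}
```

With the layout from the question, this would be called as set_submodule_branch Frameworks/SomeAmazingRepo my-fork, followed by committing .gitmodules. A later git submodule update --remote then uses that branch, but only if the branch exists on the URL recorded for the submodule.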
    

    Any remote that you add locally in a submodule, like my-fork, isn't recorded in the parent repo at all.
    So when you clone that parent repo again, it will initialize and update the submodules as recorded in the .gitmodules file (you can change that address, but only one is associated with each submodule).
    If you have other remote addresses to associate with each submodule, you need a script to automate the process.
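
One possible shape for such a script is sketched below. It is only an illustration: the function name add_submodule_remotes and the list file format (one "<submodule-path> <remote-name> <remote-url>" entry per line) are assumptions, not a Git convention, and the list file would have to be kept in the parent repo by hand.

```shell
#!/bin/sh
# Hypothetical helper: re-add extra remotes to submodules after a fresh clone.
# Input: a text file with lines of the form
#   <submodule-path> <remote-name> <remote-url>
# Blank lines and lines starting with '#' are ignored.
# Run it from the root of the parent repo, after the submodules are initialized.
add_submodule_remotes() {
    list_file=$1
    while read -r sm_path remote_name remote_url; do
        case "$sm_path" in
            ''|'#'*) continue ;;
        esac
        if git -C "$sm_path" remote get-url "$remote_name" >/dev/null 2>&1; then
            # The remote already exists: just make sure its URL is current.
            git -C "$sm_path" remote set-url "$remote_name" "$remote_url"
        else
            git -C "$sm_path" remote add "$remote_name" "$remote_url"
        fi
    done < "$list_file"
}
```

After running it, a git fetch my-fork inside each submodule would still be needed before the fork's branches and commits become available locally. The sketch relies on git remote get-url, which needs a reasonably recent Git.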

    As explained in "True nature of submodule", a submodule is primarily there to record/access a fixed point in the history.
    You can develop directly within a submodule, but you need to go there and make the right branch and/or add the right remotes.

    "it spits out an error claiming it can't find the stored commits for these repos."

    Every time you make a commit in a submodule, you need to:

    • push it to the associated remote (i.e., the one recorded in the .gitmodules of the parent repo)
    • go back to the parent repo and commit said parent.

    But:

    If you have pushed to 'my-fork' while the associated remote repo of that submodule was not 'my-fork', then the next clone won't be able to check out that submodule commit.
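
To catch that situation before committing the parent, one option is to check whether the submodule commit is reachable from a remote-tracking branch of the recorded remote. The sketch below is an illustration only: the function name commit_on_remote is hypothetical, it assumes the recorded remote is called origin unless told otherwise, and it only sees what the last git fetch brought in.

```shell
#!/bin/sh
# Hypothetical check: succeed if commit $2 in the repository at $1 is
# reachable from at least one remote-tracking branch of remote $3
# (default: origin). Relies on local remote-tracking refs, so fetch first.
commit_on_remote() {
    repo=$1
    commit=$2
    remote=${3:-origin}
    # List the remote-tracking refs of that remote which contain the commit.
    refs=$(git -C "$repo" for-each-ref --contains "$commit" "refs/remotes/$remote") || return 1
    # An empty list means no branch of that remote contains the commit.
    [ -n "$refs" ]
}
```

From the parent repo, a call could look like commit_on_remote Frameworks/SomeAmazingRepo "$(git rev-parse HEAD:Frameworks/SomeAmazingRepo)", where the second argument is the submodule SHA1 recorded in the parent's HEAD. A failure suggests the commit was pushed somewhere other than the remote recorded in .gitmodules, or not pushed at all.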


    Update August 2014 (Git 2.1)

    See commit 9393ae7 by Matthew Chen (charlesmchen):

    submodule: document "sync --recursive"

    The "git submodule sync" command supports the --recursive flag, but the documentation does not mention this.
    That flag is useful, for example when a remote is changed in a submodule of a submodule.


    Update Git 2.23 (Q3 2019)

    You can also consider git clone --recurse-submodules --remote-submodules: that clones with each submodule checked out at the tip of its remote-tracking branch instead of the SHA1 recorded as a gitlink in the parent repo.