Tags: git, git-submodules

Git: turn a directory with repos into a repo with submodules


I'm developing my project in a directory "software". The project is divided into subdirectories:

software
  |
  +-- common_forms
  |     +-- .git
  +-- common_utils
  |     +-- .git
  +-- gui_app
  |     +-- .git
  + ...
  ... and so on.

Each subdirectory is a git repo; the master directory "software" is not a repo, just a root for the repos.

Now I want to turn the master directory "software" into a repo, and include those repositories as the master repo's submodules. How can I do it without destroying the existing structure of directories and repos?


Solution

  • Let me add one word, "already", to your description (sort of redundant, but important for emphasis):

    Each subdirectory is already a git repo; the master directory "software" is not a repo, just a root for the repos.

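    In this case, just enter the software directory, run git init to create a repository (a .git directory) with software as its top-level working tree, and run git submodule add once for each submodule. When the path given to git submodule add already exists and is a valid Git repository, Git stages it as it is, without cloning, so the existing directories and repos stay where they are. See the git submodule documentation for the specifics of git submodule add. Each git submodule add also stages the .gitmodules file (an explicit git add .gitmodules is harmless), so a git commit then makes the initial commit.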

    Note that the repository argument to each git submodule add command can be the same as the path argument, written as a relative path such as ./common_forms. This affects what winds up in the .gitmodules file, which, once you make a commit in the superproject you've just created, is available to future clones of this superproject.
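
    As a rough sketch of the steps above, the following script builds a throwaway copy of the layout in a temporary directory and then converts it. The demo repositories, the "Demo" identity, and the commit messages are assumptions made only so the script can run on its own; in a real tree, only the second half is needed, run from inside the existing software directory:

    ```shell
    set -e

    # --- Demo setup only (assumption): three small repos under "software" ---
    demo=$(mktemp -d)
    mkdir "$demo/software"
    cd "$demo/software"
    for sub in common_forms common_utils gui_app; do
        git init -q "$sub"
        # A submodule needs at least one commit before it can be added.
        git -C "$sub" -c user.name=Demo -c user.email=demo@example.com \
            commit -q --allow-empty -m "initial commit"
    done

    # --- The actual conversion: run this part inside "software" ---
    git init -q
    for sub in common_forms common_utils gui_app; do
        # The path already holds a repo, so Git stages it without cloning.
        git submodule add "./$sub" "$sub"
    done
    git -c user.name=Demo -c user.email=demo@example.com \
        commit -q -m "Add existing repos as submodules"

    # Show what was recorded for future clones.
    cat .gitmodules
    ```

    The final cat should list one section per submodule, each with a path and a url entry; the url is the ./-relative value passed as the repository argument.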

    That is, in the future, someone—let's give him or her a name, e.g., Pat—will run:

    git clone $URL [<path>]
    

    to clone the superproject from some URL (for now I'm assuming it's in $URL, though Pat might type in a literal string rather than use a variable). Then Pat will cd into the clone and run, e.g., git submodule update --init.

    Pat's clone needs to contain the instructions that git submodule init or git submodule update --init will follow. These instructions tell Git what all the submodules are and how to git clone each one. That is, rather than Pat having to run one git clone for each submodule, git submodule update --init lets Pat's Git read, from the .gitmodules file, the URLs and path names needed for each of these git clone operations.
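
    Pat's two steps can be wrapped up as a small shell function. This is only a sketch: clone_with_submodules is a hypothetical helper name, not a Git command, and its two arguments stand for whatever URL and destination directory Pat actually uses:

    ```shell
    # Hypothetical helper: clone a superproject, then fetch its submodules.
    #   $1 = URL of the superproject, $2 = destination directory
    clone_with_submodules() {
        # Clone the superproject; this brings the committed .gitmodules along.
        git clone "$1" "$2" &&
        # Read .gitmodules in the new clone and clone each listed submodule.
        git -C "$2" submodule update --init
    }
    ```

    Pat would then call it as clone_with_submodules "$URL" superproj. Git can also do both steps in one command with git clone --recurse-submodules.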

    A path-relative pseudo-URL, such as ./common_forms, turns into a URL-relative URL. That is, since Pat cloned $URL to make the superproject, the .gitmodules URL ./common_forms turns into the URL ${URL}/common_forms. If this is the right URL, everything is good.
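
    To make that rule concrete, here is a small shell function that mimics it for the ./ case only. resolve_submodule_url is a hypothetical helper, not a Git command (Git does this resolution internally), and the https:// scheme in the sample URL is an assumption:

    ```shell
    # Hypothetical helper: join a "./"-relative submodule URL onto the
    # superproject URL, roughly as described above.
    #   $1 = URL the superproject was cloned from
    #   $2 = relative URL stored in .gitmodules
    resolve_submodule_url() {
        base=${1%/}     # superproject URL without a trailing slash
        rel=${2#./}     # relative URL without its leading "./"
        printf '%s/%s\n' "$base" "$rel"
    }

    resolve_submodule_url "https://host1.example.com/supers/superproj.git" "./common_forms"
    # prints https://host1.example.com/supers/superproj.git/common_forms
    ```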

    On the other hand, suppose you plan to push the superproject to host1.example.com/supers/superproj.git but all the submodules will live at URLs like host2.example.com/subs/common_forms.git. Then when you create the .gitmodules file—or before committing it, at least—you should make sure that the URL that will be stored in the .gitmodules file is host2.example.com/subs/common_forms or host2.example.com/subs/common_forms.git. That way Pat will still be able to run git submodule update --init.
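
    One way to adjust the stored URL before committing is to edit .gitmodules with git config -f. The sketch below works on a stand-in .gitmodules file in a temporary directory so that it runs on its own; the https:// scheme is an assumption, since the host names above are written without one:

    ```shell
    demo=$(mktemp -d)
    cd "$demo"

    # Stand-in for the entry that "git submodule add ./common_forms common_forms"
    # would have written (demo assumption).
    git config -f .gitmodules submodule.common_forms.path common_forms
    git config -f .gitmodules submodule.common_forms.url ./common_forms

    # Point the submodule at its real home on the other host.
    git config -f .gitmodules submodule.common_forms.url \
        https://host2.example.com/subs/common_forms.git

    # Check what future clones will see.
    git config -f .gitmodules --get submodule.common_forms.url
    # prints https://host2.example.com/subs/common_forms.git
    ```

    In a real superproject, running git submodule sync afterwards copies the changed URL from .gitmodules into the local configuration as well.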

    (It makes the most sense to arrange for the repository hosting to mirror the superproject-and-submodule layout, when that is possible, but it's not always possible.)