I have a Git project with several submodules. To use the main project I need those submodules checked out, and for the submodules to work I need their own submodules checked out too, and so on. To set this up I initialise the submodules recursively using git submodule update --init --recursive.
However, I've noticed that many of my submodules have shared dependencies, looking something like this in pseudocode (alpha -> beta means that alpha has the submodule beta):
my project -> submodule a -> submodule m
           -> submodule b -> submodule m
                          -> submodule n -> submodule x
           -> submodule c -> submodule x
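For reference, here is a rough sketch of how the duplicates could be listed. It assumes that duplicated submodules were cloned from the same origin URL; the function names are made up for illustration.

```shell
# Read one URL per line on stdin and print "count url" for every URL
# that appears more than once.
duplicate_urls() {
    sort | uniq -c | awk '$1 > 1 { print $1, $2 }'
}

# Hypothetical helper: collect the origin URL of every nested submodule
# and report the ones that are checked out more than once.
# Run from the top level of the main project.
list_duplicate_submodules() {
    git submodule --quiet foreach --recursive \
        'git config --get remote.origin.url' | duplicate_urls
}
```

Running list_duplicate_submodules in the main project should print one line per submodule that is cloned in more than one place.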
My question is: is there any way of avoiding this duplication using only git, while still having (at least one copy of) the files for each submodule?
I can imagine a solution with symlinks, but it would be preferable if git handled this for me, and I'm not sure whether putting in the symlinks myself would cause problems when updating the submodules.
Ideally I'd love to simplify it down to:
my project -> submodule a -> symlink(submodule m)
           -> submodule b -> symlink(submodule m)
                          -> symlink(submodule n)
           -> submodule c -> symlink(submodule x)
           -> submodule m
           -> submodule n -> symlink(submodule x)
           -> submodule x
Thanks in advance for any suggestions!
This isn't built into git, but you can definitely do it with symlinks like you say. You might want to have a look at git new-workdir (from git's contrib directory), which does essentially this. It's not aware of anything to do with submodules, but a submodule doesn't know it's a submodule; it's the parent repo that knows about that stuff. I haven't tried this, but I'm fairly certain you could use it something like this:
# remove the target first (new-workdir will refuse to overwrite)
rm -rf submodule_b/submodule_m
#               (original repo)         (symlinked repo)
git new-workdir submodule_a/submodule_m submodule_b/submodule_m
It works by symlinking essentially all of the .git directory. The notable thing that isn't symlinked is HEAD, so the two directories can have different things checked out but share the same refs and objects.
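If you want to check that two checkouts really do share one object store, a small sketch along these lines should do it. The function name is made up for illustration, and the usage lines assume a reasonably recent git that supports rev-parse --absolute-git-dir.

```shell
# Succeeds when two paths resolve (through any symlinks) to the same
# physical directory; fails if either path cannot be entered.
same_physical_dir() {
    a=$(cd "$1" 2>/dev/null && pwd -P) || return 1
    b=$(cd "$2" 2>/dev/null && pwd -P) || return 1
    [ "$a" = "$b" ]
}

# Usage sketch with the paths from the example above:
# same_physical_dir \
#     "$(git -C submodule_a/submodule_m rev-parse --absolute-git-dir)/objects" \
#     "$(git -C submodule_b/submodule_m rev-parse --absolute-git-dir)/objects" &&
#     echo "object store is shared"
```

Comparing the resolved paths rather than the link targets means it doesn't matter how the symlinks were created.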
From here you should be good. When you run a git submodule command in the superproject, it just goes into the submodules and runs the appropriate commands there, which will all work as expected.
The one thing you usually need to be aware of with symlinked repos like this is that they share the same set of branches, so if they both have the same branch checked out, and you commit to it in one, the other will become out of sync. With submodules this generally won't be a problem, though, since they're essentially always in detached HEAD state unless you intervene.
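To see whether a particular checkout is on a branch or detached, something like the following sketch should work. The function name is made up for illustration; git symbolic-ref -q exits non-zero without printing an error when HEAD is detached.

```shell
# Print the branch checked out in the repository at $1.
# Exits non-zero (printing nothing) when HEAD is detached.
current_branch() {
    git -C "$1" symbolic-ref -q --short HEAD
}

# Usage sketch with the paths from the example above:
# current_branch submodule_b/submodule_m || echo "detached HEAD"
```

If it prints the same branch name for two checkouts that share refs, that is the case to be careful with.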