I have a project consisting of a huge amount of data.
Because of its size, I can’t use a remote GIT repo and push/pull through the Internet. Instead, I carry a portable HDD with me, which contains the current state of the project (i.e. the workdir).
The GIT repo of this workdir is on another HDD inside my desktop computer (I used --separate-git-dir to achieve that).
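For readers unfamiliar with that option, here is a minimal sketch of the layout. The two temporary directories are stand-ins for the portable and desktop disks and are purely hypothetical; ProjectGit is the git dir name used in the diagram further down.

```shell
#!/bin/bash
# Throwaway directories stand in for the two disks (hypothetical paths).
portable=$(mktemp -d)   # plays the portable HDD (the workdir)
desktop=$(mktemp -d)    # plays the desktop HDD

# Create the repo so that its git dir lives on the "desktop" side.
git init --quiet --separate-git-dir "$desktop/ProjectGit" "$portable"

# The workdir now holds only a small text file pointing at the real git dir.
cat "$portable/.git"
```

The .git entry in the workdir is a one-line text file starting with "gitdir:", while objects, refs and HEAD end up under ProjectGit.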
From time to time, I bite the bullet, connect my external HDD to my desktop, and make another gargantuan GIT commit, in order to track the history of the project data.
The problem is that within this project, there are several small subprojects tracked by their own GIT repos. They are (relatively) lightweight, and receive commits on a regular basis.
portable HDD                                 desktop HDD
|                                            |
|-.git  <- text file (gitlink) to here ->    |-ProjectGit
|                                            |  |-objects
|-project1                                   |  |-refs
|  |-.git  <- actual git dir                 |  |-HEAD
|  |-some files                              .  .  ...
|
|-project2
|  |-.git  <- actual git dir
|  |-some files
|
|-loads
|-and
|-loads
|-of
|-files
When I try to do a git add --all . inside the main superrepo, GIT understandably gets angry that there are nested .git folders and yells at me that I should use submodules.
And I would love to do just that, except that the git dirs of submodules either (a) reside in the .git/modules folder of the superrepo, or (b) can be forced into the legacy (outdated) mode and stored inside the workdir. In case (a), I won’t have the .git folders on my external HDD and won’t be able to commit the changes in the subrepos during work; in case (b), the superrepo’s .git folder won’t have a copy of the subrepo commits, and thus if the portable HDD gets screwed the data is lost.
I want some way to pull all commits residing within the nested subrepos onto the desktop HDD each time I make a commit of the superrepo. The only way I could think of so far is to use git hooks with a script attached, which would automatically pull all changes into several small repos residing on the desktop HDD alongside the superrepo’s git dir.
I ended up just using the "old submodules" option:
git submodule add --name NAME RESERVE_HDD_PATH PORTABLE_HDD_PATH
where NAME is some valid dir name, RESERVE_HDD_PATH is the path to the subrepo on the desktop HDD, and PORTABLE_HDD_PATH is the original subrepo path that you wrote down in step 1, relative to the superrepo's root.
Delete the .git files that were created in the workdir, and copy the original subrepos back from the desktop HDD in place of those files.
Delete the modules folder from the superrepo's git dir (it's superfluous).
That's it. Now you can work in subrepos using the portable HDD, and every time you connect it to the desktop and make a commit of the superrepo, it will remember the current commits of all subrepos. You just have to make a reserve copy of those subrepos, e.g. with a script like this (residing on the desktop HDD near the subrepo folders):
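#!/bin/bash
# Pull new commits from the portable HDD into each reserve subrepo.
# submodules-list holds one subrepo folder name per line.
while IFS= read -r filename
do
    echo "Pulling into $filename..."
    # -C runs git inside the subrepo without changing the script's own
    # directory, so a missing folder cannot derail the following iterations.
    # "hdd" must already be a remote of each reserve subrepo.
    git -C "$filename" pull hdd master
done < submodules-list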
where submodules-list is a text file which contains the list of your subrepos, one per line.
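The pull command in that script only works if each reserve subrepo has a remote called hdd. A rough sketch of how that could be set up is below; it uses throwaway directories in place of the two disks, and the clone step, file names, and demo identity are my own assumptions for illustration, not part of the original setup.

```shell
#!/bin/bash
# Throwaway directories stand in for the two disks (hypothetical layout).
portable=$(mktemp -d)   # plays the portable HDD
desktop=$(mktemp -d)    # plays the desktop HDD
commit() { git -C "$1" -c user.name=demo -c user.email=demo@example.com commit --quiet -m "$2"; }

# A subrepo on the "portable HDD" with one commit on master.
git init --quiet "$portable/project1"
git -C "$portable/project1" symbolic-ref HEAD refs/heads/master
echo "hello" > "$portable/project1/notes.txt"
git -C "$portable/project1" add notes.txt
commit "$portable/project1" "first commit"

# The reserve copy on the "desktop HDD" (here made by cloning, an assumption)
# gets a remote named "hdd" that points back at the portable copy.
git clone --quiet "$portable/project1" "$desktop/project1"
git -C "$desktop/project1" remote add hdd "$portable/project1"

# New work happens on the portable side...
echo "more" >> "$portable/project1/notes.txt"
git -C "$portable/project1" add notes.txt
commit "$portable/project1" "second commit"

# ...and the same command as in the loop body brings it over.
git -C "$desktop/project1" pull --quiet hdd master
```

In real use the remote URL would be the subrepo's path on the mounted portable HDD, so the mount point has to be the same every time the disk is connected.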
I guess I could automate this even further using git hooks, but I'm content with how things are right now.
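For what it's worth, such a hook might look roughly like the sketch below. I have not used it in this setup: RESERVE_DIR and the function name pull_subrepos are hypothetical, and it assumes each reserve subrepo already has an hdd remote pointing at its counterpart on the portable HDD. It would be saved as hooks/post-commit inside the superrepo's git dir and made executable.

```shell
#!/bin/bash
# Sketch of a post-commit hook for the superrepo (untested in the real setup).
# RESERVE_DIR is a hypothetical folder on the desktop HDD that holds the
# reserve subrepos and the submodules-list file.
RESERVE_DIR="${RESERVE_DIR:-/path/to/desktop/reserve}"

pull_subrepos() {
    local reserve_dir=$1 name
    # Hooks can run with GIT_DIR and related variables exported; clear them
    # so that git -C really operates on the reserve subrepo.
    unset GIT_DIR GIT_WORK_TREE GIT_INDEX_FILE
    while IFS= read -r name; do
        [ -n "$name" ] || continue          # skip blank lines
        echo "Pulling into $name..."
        git -C "$reserve_dir/$name" pull --quiet hdd master ||
            echo "warning: pull failed for $name" >&2
    done < "$reserve_dir/submodules-list"
}

# Do nothing when the list is missing, e.g. on another machine.
if [ -f "$RESERVE_DIR/submodules-list" ]; then
    pull_subrepos "$RESERVE_DIR"
fi
```

A failed pull only prints a warning, so one broken subrepo does not stop the others from being backed up; post-commit hooks cannot abort the commit anyway.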