git, git-submodules, git-annex

Annexed submodules in git


I'd like to keep some binary files (documentation, executables, images, etc.) in a git-annex repository and then include them in several projects as git submodules. I think this would let me keep track of the correct versions of these large files as they change, with old projects pinned to the old versions and new projects to the new versions.

So I make the following repo for my big files:

mkdir annexedrepo
cp big_files annexedrepo/
cd annexedrepo
git init
git annex init
git annex add .
git commit -m "add big files"

Then I go to my project repo and add it as a submodule:

cd ../otherrepo
mkdir data
git submodule add ../annexedrepo data/annexed

I'd love it if these would just appear as symlinks to the correct files in the other repo. But I guess it's good enough if I can make the copies as I need them with:

git annex get data/annexed

This copies the files over: I can see them in otherrepo/.git/modules/data/annexed/objects/. But when I do this, the annexed files in the working tree are just dead symlinks. I can list them with ls data/annexed/, but nobody's home.
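
To see exactly which entries are dangling, a small helper along these lines may be useful. It is only a sketch: the function name list_dead_symlinks is made up for illustration, and it assumes a POSIX shell with find available.

```shell
# Hypothetical helper (name is illustrative): print every symlink under a
# directory whose target does not exist.
list_dead_symlinks() {
    dir=${1:-.}
    # -type l selects symlinks; `test -e` follows the link and fails when
    # the target is missing, so the negated -exec keeps only dead links.
    find "$dir" -type l ! -exec test -e {} \; -print
}
```

Running list_dead_symlinks data/annexed from otherrepo would print the broken links, and readlink on any of them shows where it is trying to point.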

Am I trying to do something wrongheaded? Is there a way to fix this? Are these bugs in either git-submodule or git-annex? Thanks for your help!


Solution

  • I use the same source tree structure and ran into the same problem with git-annex. I found that the git-fat extension can be used instead of git-annex and does not have this issue. My source tree looks like this:

    /project
        .git
        .gitmodules
        ...
        <project files and folders>
        ...
        submodule
            .git
            .gitattributes
            .gitfat
            ...
            <binary files>
            ...
    

    To clone such a project:

    git clone git://... project
    cd project
    git submodule init
    git submodule update
    cd submodule
    git fat init
    git fat pull
    

    git-fat uses rsync to push and pull files. See more about git-fat.
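
For reference, the .gitattributes and .gitfat files shown in the tree above could be created roughly as follows. This is a sketch based on my reading of the git-fat README, not the answerer's actual setup: the function name, the *.bin pattern, and the rsync remote are all placeholders to replace with your own values.

```shell
# Hypothetical helper (name is illustrative): write minimal git-fat
# configuration into a repository directory.
setup_git_fat_config() {
    repo_dir=${1:-.}
    # Placeholder remote; point this at your own rsync-reachable store.
    remote=${2:-example.org:/share/fat-store}
    # Route matching files through the "fat" filter. The *.bin pattern is
    # an assumption; list whatever binary extensions you actually use.
    printf '%s\n' '*.bin filter=fat -crlf' >> "$repo_dir/.gitattributes"
    # Tell git-fat where rsync should push to and pull from.
    printf '[rsync]\nremote = %s\n' "$remote" > "$repo_dir/.gitfat"
}
```

After calling setup_git_fat_config inside the submodule and committing both files, git fat init followed by git fat push (or git fat pull on a fresh clone) should move the binary content through the configured remote.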