I want to add version control to large, sensitive data files that I have excluded through the .gitignore file. For example, my repository is structured like this:
project/
    script1.py
    script2.py
    data/
        sensitive_large1.txt
        sensitive_large2.txt
With a .gitignore file:
data/
However, I still want to version these sensitive files locally to track the changes made to them. I looked into git submodules but am not certain whether they solve my problem. If I ran git init submodule inside the data directory, would that be sufficient to track those files locally, or is there a better solution?
You don't need to (and probably shouldn't) use submodules for this. Instead, you can initialize a new git repository inside the data directory. You will then have two repositories that are not connected to each other:
1. The project repository, containing everything except the data directory (which should be ignored in .gitignore).
2. The repository inside the data directory, containing all the sensitive data.
You push only the changes from the first repository to the remote; the second repository stays local.
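As a rough sketch of the two-repository setup described above, the following shell session builds a throwaway copy of the layout from the question and initializes both repositories. The scratch directory, file contents, commit messages, and the demo user name and email are all placeholders chosen for illustration, not part of the original question.

```shell
# Hypothetical demo in a scratch directory; all contents are placeholders.
set -eu
demo=$(mktemp -d)
mkdir -p "$demo/project/data"
cd "$demo/project"
printf 'print("hello")\n' > script1.py
printf 'secret v1\n' > data/sensitive_large1.txt

# Outer repository: tracks the code and ignores data/ entirely.
git init -q .
printf 'data/\n' > .gitignore
git add .gitignore script1.py
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "Add code"

# Inner repository: lives inside data/ and is never given a remote.
git -C data init -q
git -C data add sensitive_large1.txt
git -C data -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "Track sensitive data locally"

# Collect what each repository tracks, plus the inner repo's remotes.
outer_files=$(git ls-files)
inner_files=$(git -C data ls-files)
inner_remotes=$(git -C data remote)
echo "outer repo tracks: $outer_files"
echo "inner repo tracks: $inner_files"
```

Because data/ is listed in the outer .gitignore, the outer repository should never pick up the nested repository or its files, and as long as no remote is added inside data/, the sensitive history has nowhere to be pushed.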