huggingface-transformers git-lfs

If I only want to use the current version of the files, do I still need the objects under .git/lfs?


Here is an example to illustrate.

I used these commands to download bart-large from Hugging Face:

git lfs install
git clone https://huggingface.co/facebook/bart-large

The downloaded bart-large folder is 11 GB in size, and .git/lfs alone accounts for 5.2 GB of that.

I removed the objects under .git/lfs and could still load the model from the local bart-large folder. This makes me wonder: if I only want to use the current version of the model and will not modify the repo, do I still need the objects under .git/lfs?


Solution

  • You should not just delete files from under .git/lfs. If you want to prune all pushed objects, including those used by the working tree, you can run git lfs prune --force, which frees those objects. On some systems, you can instead de-duplicate the files in the working tree with git lfs dedup; whether that works depends on your operating system and file system. (It works on macOS with APFS, Windows with ReFS, and Linux with XFS and Btrfs, possibly among others.) You can try it, and it will tell you whether de-duplication is possible.

    Note that git lfs prune --force does not delete unpushed objects: they're not on the server, so deleting them would cause data loss. Deleting files from under .git/lfs by hand carries the same risk, which is why you shouldn't do it.
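As a rough illustration of the pruning approach described above, here is a minimal sketch. It assumes Git LFS is installed and that the clone is an ordinary (non-bare) repository with its LFS store at .git/lfs. The function name reclaim_lfs_space is a hypothetical helper made up for this example; it is not part of Git LFS. The sketch previews the prune with --dry-run before deleting anything.

```shell
#!/bin/sh
# Sketch only: reclaim_lfs_space is a hypothetical helper, not a Git LFS command.
# Assumes a normal clone whose LFS object store lives at <repo>/.git/lfs.
reclaim_lfs_space() {
    repo="${1:-.}"

    # Refuse to run outside a Git working tree.
    if ! git -C "$repo" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
        echo "not a Git working tree: $repo" >&2
        return 1
    fi

    # Space used by the local LFS object store before pruning.
    du -sh "$repo/.git/lfs"

    # Preview: list what would be pruned, without deleting anything.
    git -C "$repo" lfs prune --force --dry-run --verbose || return 1

    # Prune pushed objects, including those for the current checkout.
    # Unpushed objects are kept, as noted above.
    git -C "$repo" lfs prune --force || return 1

    # Space used after pruning.
    du -sh "$repo/.git/lfs"
}
```

With the clone from the question this could be called as reclaim_lfs_space bart-large. Pruning only touches the object store under .git/lfs, so the checked-out files stay in the working tree. On a file system that supports it, running git lfs dedup in the clone may be an alternative that keeps the objects without storing them twice.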