git github commit

Git tries to push oversized files that are no longer in repo nor in cache


Summary: I made a git commit that contained oversized files and, when trying to push, got the dreaded "large files" error. I restructured the repo to have a new top-level directory that no longer contains any large files, but I still get the "large files" error when trying to push. I tried various common solutions (below), but git keeps trying to push files that are outside the new top-level repo.

Details on what I did:

  1. I manually moved the .git directory and the .gitignore file to my desired new directory, as described here.

  2. I confirmed that the new root directory was successfully recognized via git rev-parse --show-toplevel.

  3. I tried to push to the remote again (git push origin main), but got the error File <filepath> is 102.90 MB; this exceeds GitHub's file size limit of 100.00 MB, where <filepath> is a path inside the old directory, not the new one.

  4. I tried to remove the file from the cache via git rm -r --cached <filepath> (as described in the accepted answer here), but this yields the error fatal: <filepath> is outside repository.

  5. I reset via git reset HEAD~, then tried again to push, but I got the same error as above.

  6. I tried to filter the branch history to remove commits involving the large file (stitched.csv) via git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch stitched.csv' HEAD, as described here. Then I tried to push again and still got the same error, again referring to stitched.csv.

In practice, I have quite a few oversized files, so I would really rather not have to remove each one from the cache manually. I have made numerous good commits since the ones that involved large files.

Any help would be much appreciated.


Solution

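  • As commented, you need to filter and remove those large files from your Git history. git push sends every commit the remote does not have yet, so the large files are still uploaded as long as any unpushed commit contains them, even if your latest commit no longer does.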

    The more recent option for this is the third-party tool git filter-repo, which has its own installation process and requires Python.

    In order to not have to list every large file, you can determine a size above which you want any file to be removed:

    git filter-repo --strip-blobs-bigger-than 2M
    

    Replace "2M" (two megabytes) with an appropriate size: see "How to find the N largest files in a git repository?".
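
As a rough way to pick that size, the sketch below lists every blob in the history above a given byte threshold, largest first. The name list_large_blobs is a hypothetical helper, not a git command; it assumes a POSIX shell with awk and sort available, and it only reads the repository through git rev-list and git cat-file.

```shell
# Hypothetical helper (not part of git): print "<size-in-bytes> <path>" for
# every blob larger than MIN_BYTES that is reachable from the given revisions.
# Usage: list_large_blobs MIN_BYTES [rev-list arguments...]
list_large_blobs() {
  min_bytes="$1"
  shift
  # Default to every ref when no revision range is given.
  [ "$#" -gt 0 ] || set -- --all
  # rev-list emits "<object id> <path>"; cat-file adds type and size and
  # passes the path through unchanged as %(rest).
  git rev-list --objects "$@" |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk -v min="$min_bytes" '
      $1 == "blob" && $2 + 0 > min + 0 {
        size = $2
        sub(/^blob [0-9]+ /, "")   # keep the full path, including spaces
        print size, $0
      }' |
    sort -rn
}
```

For example, list_large_blobs 104857600 lists blobs over 104857600 bytes across all refs (assuming that byte count is what the 100.00 MB limit in the error message corresponds to). Passing a range, as in list_large_blobs 104857600 origin/main..main, restricts the search to commits that have not been pushed yet, assuming the remote-tracking ref is named origin/main. If that second call still prints paths after a rewrite, the push will most likely be rejected again.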