
Remove large .pack file created by git


I checked a load of files in to a branch and merged it, then had to remove the files again. Now I'm left with a large .pack file that I don't know how to get rid of.

I deleted all the files using git rm -rf xxxxxx, and I also ran it with the --cached option.

How can I remove a large .pack file that is currently in the following directory?

.git/objects/pack/pack-xxxxxxxxxxxxxxxxx.pack

Do I just need to remove the branch that I still have, but I am no longer using? Or is there something else I need to run?

I'm not sure how much difference it makes but it shows a padlock against the file.


Here are some excerpts from my bash_history file that should give an idea of how I got into this state (assume at this point I'm working on a git branch called 'my-branch' and I've got a folder containing more folders/files):

git add .
git commit -m "Adding my branch changes to master"
git checkout master
git merge my-branch
git rm -rf unwanted_folder/
rm -rf unwanted_folder/     (not sure why I ran this as well but I did)

I thought I also ran the following, but it doesn't appear in the bash_history with the others:

git rm -rf --cached unwanted_folder/

I also thought I ran some git commands (like git gc) to try to tidy up the pack file, but they don't appear in the .bash_history file either.


Solution

  • The issue is that, even though you removed the files, they are still present in previous revisions. That is the whole point of git: even if you delete something, you can still get it back through the history.
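To see this for yourself, here is a minimal sketch in a throwaway repository. The temp directory, the demo identity, and the file name big.bin are all made up for illustration, and the final pipeline is just one common way to list the largest objects in history.

```shell
#!/bin/sh
# Sketch: a file removed with `git rm` is still stored in the history.
# The temp repo, demo identity and big.bin are hypothetical placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit a 1 MiB file of random data, then delete it in a second commit.
dd if=/dev/urandom of=big.bin bs=1024 count=1024 2>/dev/null
git add big.bin
git commit -q -m "Add big file"
git rm -q big.bin
git commit -q -m "Remove big file"

# The working tree no longer contains the file...
ls -A

# ...but the blob is still reachable from the first commit.
# List every object in history with its size and print the largest one.
largest=$(git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
  sort -k3,3n |
  tail -n 1)
echo "largest object in history: $largest"
```

In a real repository only the last pipeline is needed; swapping `tail -n 1` for `tail -n 10` shows the ten largest objects and the paths they were committed under.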

    What you are looking to do is called rewriting history, and it involves the git filter-branch command.

    GitHub has a good explanation of the issue on their site. https://help.github.com/articles/remove-sensitive-data

    To answer your question more directly, what you basically need to run is this command with unwanted_filename_or_folder replaced accordingly:

    git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch unwanted_filename_or_folder' --prune-empty
    

    This will remove all references to the files from the active history of the repo.
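As a rough check, the sketch below runs the same command in a throwaway repository and then looks for leftovers. The repository contents and the demo identity are invented for illustration; FILTER_BRANCH_SQUELCH_WARNING is set only to skip the notice that newer git versions print before filter-branch starts.

```shell
#!/bin/sh
# Sketch: run filter-branch on a scratch repo and verify the result.
# Repo contents and identity are hypothetical placeholders.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "keep" > keep.txt
git add keep.txt
git commit -q -m "Keep this"

mkdir unwanted_folder
echo "large data" > unwanted_folder/big.bin
git add unwanted_folder
git commit -q -m "Add unwanted folder"

# Same command as above, with the placeholder replaced by unwanted_folder.
git filter-branch --index-filter \
  'git rm -r --cached --ignore-unmatch unwanted_folder' --prune-empty >/dev/null 2>&1

# Commits on the current branch that still touch the path.
on_branch=$(git log --oneline HEAD -- unwanted_folder | wc -l | tr -d ' ')
# Backup refs that filter-branch wrote; these still point at the old history.
backups=$(git for-each-ref refs/original | wc -l | tr -d ' ')

echo "commits touching unwanted_folder on HEAD: $on_branch"
echo "backup refs under refs/original: $backups"
```

The backup refs under refs/original are why the rewritten history alone does not shrink the pack file: the old commits are still referenced until those refs are deleted.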

    The next step is to run a GC cycle so that all references to the files are expired and the objects are purged from the packfile. Nothing needs to be replaced in these commands.

    git for-each-ref --format='delete %(refname)' refs/original | git update-ref --stdin
    # or, for older git versions (e.g. 1.8.3.1) which don't support --stdin
    # git update-ref $(git for-each-ref --format='delete %(refname)' refs/original)
    git reflog expire --expire=now --all
    git gc --aggressive --prune=now
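The reflog step matters because reflog entries count as references. The sketch below illustrates this in a throwaway repository, under the assumption that the big file only ever lived on a branch named my-branch that is then deleted; file names and identity are made up. It compares the pack size reported by git count-objects before and after the reflog is expired.

```shell
#!/bin/sh
# Sketch: deleting a branch is not enough; the reflog keeps its objects alive.
# Repo contents and identity are hypothetical placeholders.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
main=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

echo "keep" > keep.txt
git add keep.txt
git commit -q -m "Keep this"

# Commit a 1 MiB file on a side branch, then delete that branch.
git checkout -q -b my-branch
dd if=/dev/urandom of=big.bin bs=1024 count=1024 2>/dev/null
git add big.bin
git commit -q -m "Add big file"
git checkout -q "$main"
git branch -q -D my-branch

# gc alone keeps the blob, because the HEAD reflog still mentions the commit.
git gc -q --prune=now
before=$(git count-objects -v | awk '/^size-pack:/ {print $2}')

# Expire the reflog, then collect again.
git reflog expire --expire=now --all
git gc -q --aggressive --prune=now
after=$(git count-objects -v | awk '/^size-pack:/ {print $2}')

echo "size-pack before: ${before} KiB, after: ${after} KiB"
```

Running `git count-objects -vH` in your own repository before and after the commands above is a quick way to confirm that the pack file has actually shrunk.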