
git: --prune-empty after using bfg duplicates commits


I am using bfg to remove some subdirectories from a (clone of) git repository:

java -jar bfg-1.12.12.jar --delete-folders {folder1,folder2,folder3} --no-blob-protection myrepo.git/
git reflog expire --expire=now --all && git gc --prune=now --aggressive

This works, but after running bfg I am left with many empty commits (i.e. commits that keep their log messages but contain no changes, because they only touched files that have now been removed).
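To see how many such commits there are, one option is to compare each commit's tree with its first parent's tree. The following is only a sketch: `list_empty_commits` is a hypothetical helper (not a git or BFG command). It skips merge commits and root commits, and it walks every ref via `--all`.

```shell
# Hypothetical helper: print the hashes of non-merge commits whose tree is
# identical to their first parent's tree, i.e. commits that change nothing.
list_empty_commits() {
    repo=$1
    git -C "$repo" rev-list --no-merges --all | while read -r commit; do
        tree=$(git -C "$repo" rev-parse "$commit^{tree}")
        # Root commits have no parent to compare against, so skip them.
        parent_tree=$(git -C "$repo" rev-parse --verify --quiet "$commit^1^{tree}") || continue
        if [ "$tree" = "$parent_tree" ]; then
            echo "$commit"
        fi
    done
}
```

For example, `list_empty_commits myrepo.git | wc -l` gives a count that can be compared before and after pruning.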

So as a next step I tried

git filter-branch --commit-filter 'git_commit_non_empty_tree "$@"' HEAD

or

git filter-branch --prune-empty --tag-name-filter cat -- --all

Neither version has the desired effect (removing empty commits).

Instead, I end up with a repository in which (see the screenshot below; left is before pruning, right is after):

  1. a few empty commits have been removed
  2. most empty commits remain
  3. non-empty commits are duplicated in separate trains of commits

[Screenshot: commit graph before pruning (left) and after pruning (right)]

Any advice?


Solution

  • Judging by the duplicate history you are seeing, your post-BFG attempt to remove the old, now-rewritten commit history has probably failed. This can happen for a number of reasons, but the main one is that myrepo.git is not a bare (mirror-clone) repo, as outlined in the BFG instructions.

    Something is retaining the old, pre-BFG history, which now shows up as a duplicate. It is possible, even likely, that this history is still held by a remote such as origin, which would also explain why your filter-branch run is not removing all the empty commits you expect.

    Finally, you may be interested in a current pull request that adds a --prune-empty-commits feature to BFG. It works well and, like everything in BFG, is orders of magnitude faster than running filter-branch.
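The two suspicions above (a non-bare repository, and refs that still point at the old history) can be checked from the command line. This is a rough sketch, not part of BFG: `check_bare` and `list_extra_refs` are hypothetical helper names, and the URL in the comment is a placeholder. Note that git filter-branch also keeps backups of the original refs under refs/original/, which retain the pre-rewrite commits in the same way a remote does.

```shell
# Hypothetical helper: report whether the repository at $1 is bare.
check_bare() {
    if [ "$(git -C "$1" rev-parse --is-bare-repository)" = "true" ]; then
        echo "bare"
    else
        echo "not bare"
    fi
}

# Hypothetical helper: list refs other than local branches and tags,
# e.g. refs/remotes/* or refs/original/*, which may keep old commits alive.
list_extra_refs() {
    git -C "$1" for-each-ref --format='%(refname)' |
        grep -v -e '^refs/heads/' -e '^refs/tags/' || true
}

# A fresh mirror clone (placeholder URL) would be created with:
#   git clone --mirror git://example.com/myrepo.git
```

If `check_bare myrepo.git` prints "not bare", or `list_extra_refs myrepo.git` prints anything, those are plausible places for the duplicate history to be coming from.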