Tags: git, performance, garbage-collection

Is it possible to get `git gc` to pack reflog objects?


As hinted by the answer at https://stackoverflow.com/a/32025729, I've configured a remote bare repo with

git config gc.pruneExpire never
git config gc.reflogExpire never

and as a result, I always keep all stored commits of all branches and tags even if I do not create permanent branch names or tags for all those commits.

However, in the long run this causes the following warning to appear:

warning: There are too many unreachable loose objects; run 'git prune' to remove them.

This is caused by the fact that the dangling commits that I want to keep forever are always stored as loose objects.

Is there a nice way to get git to deliberately include dangling commits referenced only by the reflog in pack files? That would keep performance high without losing unnamed history.

I know that I can work around the warning by setting gc.auto to some really high number, but that will cause a (minor?) performance problem in the long run.
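
Before changing gc.auto, it may help to check whether the loose objects are actually protected by a reflog. The following is a minimal sketch, not a definitive diagnostic: the function name `reflog_protection_report` is hypothetical, and it relies on `git fsck` treating reflog entries as reachability roots by default, which `--no-reflogs` turns off. Note that `git fsck` can be slow on a large repository.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): report how many objects are
# unreachable with and without reflog entries counted as roots.
# Usage: reflog_protection_report /path/to/repo.git   (defaults to ".")
reflog_protection_report() {
    repo=${1:-.}

    # "count" in this output is the number of loose objects.
    git -C "$repo" count-objects -v

    # fsck treats reflog entries as roots by default ...
    with_reflog=$(git -C "$repo" fsck --unreachable 2>/dev/null \
        | grep -c '^unreachable' || true)
    # ... and ignores them when --no-reflogs is given.
    without_reflog=$(git -C "$repo" fsck --unreachable --no-reflogs 2>/dev/null \
        | grep -c '^unreachable' || true)

    echo "unreachable with reflogs honoured: $with_reflog"
    echo "unreachable with reflogs ignored: $without_reflog"
}
```

If both numbers are the same, no reflog is protecting those objects, which would suggest the reflog is missing or empty rather than that gc is failing to pack reflog-referenced commits.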


Solution

  • Git is supposed to pack reflog-only-reachable objects. So what you describe is not supposed to happen.

    What could be happening is that you haven’t enabled the reflog in the bare repository (it is not enabled by default). [1] To enable the reflog in a repository:

    git config core.logAllRefUpdates always
    

    OP asked about this on the Git mailing list. Jeff King responded:

    That's not what's supposed to happen. A normal git-gc (or directly running the "git repack" it spawns) should consider objects in reflogs reachable, and pack them as it would an object reachable from a ref. This has been the case since 63049292e0 (Teach git-repack to preserve objects referred to by reflog entries., 2006-12-18).

    Just to double check: are you sure you have reflogs? They're not enabled by default in bare repos.

    Notes

    1. There are also cruft packs, a pack type for unreachable objects. But they didn’t exist in 2019.
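
The behaviour described above can be checked in a throwaway setup. The sketch below is an illustration under assumptions, not part of the original answer: the paths, the branch name `topic`, and the `demo` identity are made up, and `gc.reflogExpireUnreachable` is an extra setting not mentioned in the question (reflog entries unreachable from the branch tip otherwise follow their own 30-day default). It force-pushes over a commit in a bare repository with reflogs enabled, runs `git gc`, and then checks that the overwritten commit still exists while no loose objects remain.

```shell
#!/bin/sh
set -e

# Illustrative identity so commits and reflog entries can be written anywhere.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)   # throwaway directory; remove it by hand afterwards
remote="$work/remote.git"
client="$work/client"

# Bare "server" repository with reflogs enabled and expiry disabled.
git init -q --bare "$remote" 2>/dev/null
git -C "$remote" config core.logAllRefUpdates always
git -C "$remote" config gc.pruneExpire never
git -C "$remote" config gc.reflogExpire never
# Assumption, not in the original question: also keep entries that are
# unreachable from the current branch tip.
git -C "$remote" config gc.reflogExpireUnreachable never

# Push a commit, then rewrite it and force-push over it.
git init -q "$client" 2>/dev/null
git -C "$client" commit -q --allow-empty -m first
old=$(git -C "$client" rev-parse HEAD)
git -C "$client" push -q "$remote" HEAD:refs/heads/topic
git -C "$client" commit -q --amend --allow-empty -m rewritten
git -C "$client" push -q --force "$remote" HEAD:refs/heads/topic

# The old commit is now referenced only by the reflog of refs/heads/topic.
entries=$(git -C "$remote" reflog show refs/heads/topic | wc -l)

git -C "$remote" gc -q

# After gc: the old commit still exists and no loose objects remain.
git -C "$remote" cat-file -e "$old"
loose=$(git -C "$remote" count-objects -v | sed -n 's/^count: //p')
echo "reflog entries: $entries, loose objects after gc: $loose"
```

If the `core.logAllRefUpdates` line is left out, the bare repository records no reflog, so the overwritten commit becomes genuinely unreachable and is kept only because `gc.pruneExpire` is `never`.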