
Pruning old Git commits without rebasing


Suppose I have a Git repository with huge trees (~60 GiB) and some history, where old versions contain many deleted files.

I now want to prune old history, but without rebasing all the commits after the prune point, because that would take several hours for each commit.

  • Can I just delete the first commit object to remove, and hope for git gc to delete all (now unreferenced) older ones? Or will this cause panic because of missing objects?

  • Can I use git replace to replace the first commit I want to remove with a dummy commit and then call git gc?

  • Is there some other method to remove my old commits in-place?


Solution

  • without rebasing all the commits after the prune point, because that would take several hours for each commit.

    After Git 2.18 (Q2 2018)

    Since Git 2.18, the grafts file has been superseded by git replace (refs stored under refs/replace/): see "What are .git/info/grafts for?".

    As noted in the comments, you would instead use
    git replace --convert-graft-file:

    Creates graft commits for all entries in $GIT_DIR/info/grafts and deletes that file upon success.
    The purpose is to help users with transitioning off of the now-deprecated graft file.

    So after git rev-parse HEAD~100 > .git/info/grafts, run git replace --convert-graft-file to turn the graft into a replacement ref.

    Also, git filter-branch and BFG are considered obsolete after Git 2.22.

    Install git filter-repo and use git filter-repo --force
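The Git 2.18+ steps can be tried safely on a throwaway repository first. The following is only a sketch under stated assumptions: the temporary repository, the five empty commits, and the cut point HEAD~2 are made up for illustration (the answer itself uses HEAD~100), and it assumes Git 2.18 or later is on the PATH.

```shell
#!/bin/sh
set -eu

# Throwaway repository with five empty commits (hypothetical demo data).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
for i in 1 2 3 4 5; do
    git commit -q --allow-empty -m "commit $i"
done

# A grafts line holding only a commit id means "this commit has no parents".
# Here HEAD~2 becomes the new root, so three commits stay visible.
mkdir -p .git/info
git rev-parse HEAD~2 > .git/info/grafts

# Git 2.18+: convert each graft into a ref under refs/replace/
# and delete the grafts file on success.
git replace --convert-graft-file

visible=$(git rev-list --count HEAD)
echo "commits visible from HEAD: $visible"

# The replacement is only an overlay at this point. To rewrite the commits
# for real, the answer suggests the separately installed tool:
#   git filter-repo --force
```

Until the history is actually rewritten, the old commits are still in the object database; running git --no-replace-objects log in the same repository shows the full, unmodified history.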


    Before Git 2.18 (Q2 2018):

    That is what graft points are for (they suit this particular case better than git replace).

    A .git/info/grafts file containing a single line with only a commit id declares that this commit has no parent.
    To keep the last 100 commits, use git rev-parse:

     git rev-parse HEAD~100 > .git/info/grafts
    

    Then:

     git filter-branch -- --all
    

    Next, remove the backup refs that filter-branch leaves behind:

    rm -Rf .git/refs/original
    

    Then you can prune the rest:

    git reflog expire --expire=now --all
    git gc --prune=now
    git gc --aggressive --prune=now
    git repack -Ad      # kills in-pack garbage
    git prune           # kills loose garbage
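To see why the reflog has to be expired before pruning, here is a small sketch on a throwaway repository. Everything in it (the temporary directory, the scratch branch, the commit messages) is hypothetical demo data, not part of the original answer; it only uses the reflog and gc commands listed above.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "kept"

# Create a commit on a side branch, then delete the branch so that
# only the reflog still refers to that commit.
git checkout -q -b scratch
git commit -q --allow-empty -m "to be pruned"
dropped=$(git rev-parse HEAD)
git checkout -q -
git branch -q -D scratch

# The object still exists here; set -e would abort the script otherwise.
git cat-file -e "$dropped"

# Drop all reflog entries, then collect everything that is now unreachable.
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$dropped" 2>/dev/null; then
    status="still present"
else
    status="pruned"
fi
echo "dropped commit is $status"
```

On a real repository, comparing the output of git count-objects -vH before and after the cleanup gives a rough idea of how much space was reclaimed.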