Speed up `git blame` on a repository with many commits


I am trying to run git blame on the following file on my local machine, because GitHub takes too long to generate its blame view:

https://github.com/Homebrew/homebrew-core/blob/master/Formula/sqlite.rb

But it is also very slow locally: it takes over a minute on my machine, as measured by

time git --no-pager blame Formula/sqlite.rb > /dev/null

The repository contains over 150K commits.

Is there a way to speed up the git blame command?


Solution

  • The homebrew-core repository is rather large by Git standards: about 250 MB, with 150,000 commits for 4,000 "Formulas". This can affect performance, and GitHub is indeed having trouble with it.

    git blame Formula/sqlite.rb takes about 45 seconds on my 2018 i7 MacBook with Git 2.22.0. That is slow by Git standards, but acceptable given how infrequently one runs git blame.

    As a user of this repository, there isn't much you can do. git blame must walk backwards through the history, checking each commit to see whether it alters this file. Unfortunately, git blame does not appear to take advantage of parallel processing.
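
    To get a feel for the size of that walk, one can compare the number of commits reachable from HEAD with the number that actually modify the file. This is only a rough sketch; the function name blame_scope is made up for illustration, and the counts say nothing precise about how long blame itself will take.

    ```shell
    # Hypothetical helper (the name is an assumption, not a Git command):
    # print how many commits are reachable from HEAD and how many of
    # them modify the given path.
    blame_scope() {
        # Every commit reachable from HEAD.
        total=$(git rev-list --count HEAD) || return 1
        # Only the commits that change the given path.
        touching=$(git rev-list --count HEAD -- "$1") || return 1
        printf '%s commits reachable, %s touch %s\n' "$total" "$touching" "$1"
    }
    ```

    Run from the root of the repository as blame_scope Formula/sqlite.rb.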

    There are some options...

    1. Contact GitHub about the problem and hope they can address it.
    2. Restrict how far back you look in history: git blame --since=1.year -- Formula/sqlite.rb
    3. Reconsider whatever process requires a speedy git blame on this repo.
    4. Cache the result.
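
For the last option, a minimal sketch of a cache might look like the following. It assumes a POSIX shell; the function name cached_blame and the blame-cache directory inside the Git directory are made up for illustration and are not part of Git. Entries are keyed on the current HEAD commit and the file path, so the slow blame runs once per commit and later calls just print the stored output.

```shell
# Hypothetical helper (name and cache location are assumptions):
# cached_blame <path>
cached_blame() {
    file=$1
    # Key on the HEAD commit so the entry is ignored once history moves.
    rev=$(git rev-parse HEAD) || return 1
    cache_dir=$(git rev-parse --git-dir)/blame-cache
    # Hash the path so it can serve as a flat file name.
    key=$(printf '%s' "$file" | git hash-object --stdin)
    cache_file=$cache_dir/$rev-$key
    if [ ! -f "$cache_file" ]; then
        mkdir -p "$cache_dir"
        # Write to a temporary file first so an interrupted run
        # does not leave a partial cache entry behind.
        if ! git --no-pager blame -- "$file" > "$cache_file.tmp"; then
            rm -f "$cache_file.tmp"
            return 1
        fi
        mv "$cache_file.tmp" "$cache_file"
    fi
    cat "$cache_file"
}
```

After defining the function, cached_blame Formula/sqlite.rb pays the full cost on the first call and returns almost immediately on later calls at the same commit. Because the key includes HEAD, every new commit invalidates the entry, which is a deliberately conservative choice; old entries are never cleaned up in this sketch.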