
(How) can I filter the commit log messages?


I extracted a new module from a git repository by repeatedly running git filter-branch ... (following instructions I found in a book). Basically, I removed many unrelated files (i.e. "everything else"). Now I'm done, but git log still has many log messages describing changes to files that are no longer there.

Thus I have a question: Is it possible to "filter away" those log messages that do not affect any of the remaining files? In addition, is it possible to "amend" all the existing log entries to remove text referring to files that are no longer there?

That is, the first part would remove complete commit messages (possibly "now empty" commits, too), while the second part would allow editing log messages that cannot be removed (maybe similar to an interactive rebase).


Solution

  • This is the hard way to do it (I have not received a better proposal):

    1. Start with git rebase --interactive <root-commit> (git would not accept an all-zero commit ID). Note that the commit named on the command line is itself not part of the todo list; git rebase --interactive --root includes it. In the editor, replace every pick with reword. Then, as the rebase stops at each commit, mark the title of every unrelated commit (i.e. one affecting only files that are no longer there) with a unique mark such as DDD (for "to be deleted"). With more than 300 commits, that was quite some work.

    2. After completing the first rebase, run git rebase --interactive <root-commit> a second time, but this time delete all lines marked with DDD; in vi, for example, use :g/ DDD /d. This step effectively drops the unrelated commits, and with them their messages, from the history.

    3. If git log still shows some useless commit messages, you may want to do another round of interactive rebasing, this time using squash instead of pick wherever multiple commits can be merged into one. I actually repeated this step before I was satisfied with the result.

    If you assigned a tag before working on the repository and another one after finishing, then you can check the results with git diff <old-tag> <new-tag>. As rebasing obsoletes many commits, you might consider git gc (possibly after a git fsck) to tidy up the repository.
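
Step 2 and the tag check above can be tried out non-interactively. The following is only a minimal sketch under stated assumptions, not the exact procedure used above: it builds a throwaway repository, the file name kept.txt and the tag names before-cleanup and after-cleanup are made up for illustration, and the DDD marks are written directly into the commit titles instead of being added through reword. It relies on the GIT_SEQUENCE_EDITOR environment variable, which git runs on the rebase todo file in place of an interactive editor.

```shell
#!/bin/sh
set -eu

# Throwaway repository, so the sketch never touches real history.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"          # hypothetical identity
git config user.email "author@example.invalid"
git config commit.gpgsign false

# Two commits that touch the remaining file, and two empty commits whose
# titles already carry the DDD mark (standing in for the result of step 1).
echo one > kept.txt
git add kept.txt
git commit -q -m "Add kept.txt"
git commit -q --allow-empty -m "DDD change to a file that is gone"
echo two >> kept.txt
git commit -q -am "Extend kept.txt"
git commit -q --allow-empty -m "DDD another unrelated change"
git tag before-cleanup

# Step 2 without opening an editor: sed deletes every todo line that
# contains " DDD ", like :g/ DDD /d in vi.
# --keep-empty asks git to list the empty commits in the todo file; it is
# harmless where that is already the default.
GIT_SEQUENCE_EDITOR="sed -i.bak '/ DDD /d'" \
    git rebase -q --interactive --keep-empty --root

git tag after-cleanup

# The remaining commit titles, newest first.
git log --format=%s

# Exits non-zero (and stops the script) if any file content changed.
git diff --exit-code before-cleanup after-cleanup
```

On a real repository the marks still have to be added by hand first, and the sed pattern only matches titles where DDD is surrounded by spaces, so it is worth running this on a clone before rewriting the actual history.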