git, version-control, blame

How does git blame determine who edited a line of a file?


Using

git blame file

will show information about each line, for example who added the line, in which commit, and when. But as far as I know, Git adds a completely new object every time you change a file. So where does Git store this per-line information?


Solution

  • As @mvp said, "it doesn't": Git stores no per-line authorship information, and git blame computes it on demand from the commit history. To answer your comment, though (that is, "the flow of this process"), it works very roughly like a series of git diffs, starting with the most recent version of the file and working backwards until every line has an assigned origin.

    Suppose you have a short file with just four lines, and it is the most recent (i.e., the version in HEAD). Suppose further that git diff shows that in revision HEAD~1 there were only the first three lines, and I added that last (fourth) line. Then the "blame" for line 4 would be mine: it was not there in the previous version and it is there in the current one, so I must have added it.

    That leaves the problem of figuring out who to "blame" for those three lines. So now git must diff HEAD~1 against HEAD~2. If those three lines all appear exactly as is in HEAD~2—which might be the case if, for instance, the change from HEAD~2 to HEAD~1 was simply to delete some lines—then we must keep going further back in history.

    At some point, though, git diff will show that someone added line 1, line 2, and/or line 3 (in some previous version), possibly while deleting some other line(s); or in the worst case, git will reach a "root commit": a commit with no parents. In any case, whoever made the commit(s) that caused those lines to appear must be the one to blame.
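
The backwards walk described above can be sketched in a few lines of Python. This is only a toy model, not what Git actually runs: it assumes a strictly linear history passed in as a list (no merges), it uses Python's difflib in place of Git's own diff machinery, and it does no move or copy detection. The names toy_blame and history are made up for this illustration.

```python
from difflib import SequenceMatcher


def toy_blame(history):
    """Assign each line of the newest version to the commit that introduced it.

    history is a hypothetical stand-in for a linear commit chain: a list of
    (commit_id, lines) pairs, oldest first. Returns one commit_id per line
    of the newest version.
    """
    newest_lines = history[-1][1]
    blame = [None] * len(newest_lines)
    # Maps a line index in the version being examined to the index of the
    # same line in the newest version. Only unblamed lines are tracked.
    pending = {i: i for i in range(len(newest_lines))}

    for k in range(len(history) - 1, -1, -1):
        commit_id, lines = history[k]
        if k == 0:
            # Root commit: there is no parent to diff against, so every
            # line still unassigned must have originated here.
            for final_idx in pending.values():
                blame[final_idx] = commit_id
            break

        parent_lines = history[k - 1][1]
        matcher = SequenceMatcher(a=parent_lines, b=lines, autojunk=False)
        carried = {}
        for block in matcher.get_matching_blocks():
            for offset in range(block.size):
                child_idx = block.b + offset
                if child_idx in pending:
                    # The line also exists in the parent: keep looking further back.
                    carried[block.a + offset] = pending.pop(child_idx)

        # Lines with no counterpart in the parent were introduced by this commit.
        for final_idx in pending.values():
            blame[final_idx] = commit_id

        pending = carried
        if not pending:
            break  # every line has an assigned origin

    return blame


# The four-line example from the answer: HEAD~1 only deleted a line,
# and HEAD added line 4.
history = [
    ("HEAD~2", ["line 1", "obsolete", "line 2", "line 3"]),
    ("HEAD~1", ["line 1", "line 2", "line 3"]),
    ("HEAD", ["line 1", "line 2", "line 3", "line 4"]),
]
print(toy_blame(history))  # ['HEAD~2', 'HEAD~2', 'HEAD~2', 'HEAD']
```

Note that HEAD~1 gets no blame at all: it only deleted a line, so the three surviving lines are passed straight through to HEAD~2, which here plays the role of the root commit.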