git, object, hash, blob

Why does git log --find-object return two commits with different file content for a given blob?


I am using git log --find-object to identify commits by providing git file blobs (file content hashes).

This usually works fine: I first compute the blob hash of a file using git hash-object.

However, sometimes for a given blob hash of a file, git log --find-object=<blob> returns two commits for the same file, where the contents of the file in the returned commits definitely differ.

I would expect to get multiple commits in which the corresponding file content is identical, but getting commits in which the content is not exactly the same seems odd to me (based on my current understanding of the --find-object option).

Why is that? How would I have to refine the command?


Solution

  • As stated in the documentation (see also the -S and -G options to make sense of it):
    with this option, a commit is listed if the number of occurrences of the given object changes.

    So, if you take the blobid of a file in your repo (say, the blobid of file Readme.md),

    git log --find-object=<blobid> will:

    1. report commits where this blobid appears as file Readme.md (that's what you expect);
    2. report commits where that blob disappears as file Readme.md, e.g. a commit which changed the content of Readme.md from blobid to something else;
    3. report commits where this blob appears or disappears at some other path, e.g. at some point, file doc/Doc.md had the exact same blobid;
    4. not report commits where a file with that exact content has been renamed, e.g. file doc/Doc.md has been renamed to Readme.md, or Readme.md to doc/Doc.md.
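The first three cases can be reproduced in a throwaway repository. The following is a minimal sketch, not a definitive test: it assumes a POSIX shell, git 2.16 or later (where --find-object is available), and default diff settings. The file other.txt, the commit messages, and the variable names are made up for illustration.

```shell
#!/bin/sh
set -e

# Scratch repository in a temporary directory (illustrative setup only)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

# Case 1: the blob appears as Readme.md
echo "hello" > Readme.md
git add Readme.md
git commit -q -m "add Readme.md"
blob=$(git hash-object Readme.md)

# Case 3: the same blob appears at another path
mkdir doc
cp Readme.md doc/Doc.md
git add doc/Doc.md
git commit -q -m "add doc/Doc.md with same content"

# Case 2: the blob disappears from Readme.md (content changed)
echo "changed" > Readme.md
git commit -q -am "change Readme.md"

# A commit that neither adds nor removes the blob
echo "unrelated" > other.txt
git add other.txt
git commit -q -m "add unrelated other.txt"

# List the subjects of the commits that --find-object reports
found=$(git log --format=%s --find-object="$blob")
printf '%s\n' "$found"
```

Under these assumptions the log should list the first three commits, including the one that changed Readme.md away from the blob, but not the last one.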

    You can run:

    git ls-tree -r <commit> | grep <blobid>
    # check parent commit too :
    git ls-tree -r <commit>^ | grep <blobid>
    

    to see whether <commit> or its parent contained that blob, and at what path.

    If you want to check what modified the specific path Readme.md, you can add it as a filter to git log:

    git log --find-object=<blobid> -- Readme.md
    

    This will get rid of cases 3 and 4 above.
    You will still see commits where the content you are looking for is only in the parent commit (case 2 above).
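
To tell case 1 from case 2 in the path-limited output, one option is to compare the blob stored at Readme.md in each reported commit with the blob you searched for. The sketch below is only an illustration under the same assumptions as before (POSIX shell, git 2.16 or later, a scratch repository); the helper blob_at and the labels "appears" and "disappears" are hypothetical names, not git features.

```shell
#!/bin/sh
set -e

# Scratch repository (illustrative setup only)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

echo "hello" > Readme.md
git add Readme.md
git commit -q -m "add Readme.md"
blob=$(git hash-object Readme.md)

mkdir doc
cp Readme.md doc/Doc.md
git add doc/Doc.md
git commit -q -m "copy to doc/Doc.md"

echo "changed" > Readme.md
git commit -q -am "change Readme.md"

# Hypothetical helper: print the blob id stored at Readme.md in revision $1,
# or nothing if the path does not exist there
blob_at() {
    git rev-parse -q --verify "$1:Readme.md" 2>/dev/null || true
}

# Walk the commits reported for Readme.md only and label each one
report=""
for c in $(git log --format=%H --find-object="$blob" -- Readme.md); do
    if [ "$(blob_at "$c")" = "$blob" ]; then
        kind="appears"       # case 1: the commit itself holds the blob
    else
        kind="disappears"    # case 2: the blob is only in the parent
    fi
    report="$report$kind: $(git log -1 --format=%s "$c")
"
done
printf '%s' "$report"
```

The commit that only touched doc/Doc.md should be absent here because of the path filter, while the commit that changed Readme.md is still reported and labelled as a disappearance.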