Why does the SHA1 generated by Git for a binary file (image) differ from the SHA1 produced by other tools?


Hi,

Just out of curiosity, I checked the SHA1 that Git generates for an image file, "tech01.jpg". However, the SHA1 generated by Git differs from the one produced by other tools.

As far as I understand, the SHA1 of the same file, whether its content is binary or text, should be the same irrespective of the system.

So why does the SHA1 generated by Git differ from the one generated by other tools? Does Git use a different hashing algorithm, or does it modify the file or text in some way, or am I missing something in my understanding of how Git uses SHA1?

This is what I get:

[Screenshot: Bash vs other tools SHA1 difference]

I am currently using git version 2.13.0.windows.1 via Bash (MinGW) on a Windows 7 64-bit machine, if that matters.

Test image file: [Image used in testing]


Solution

  • Git isn't calculating the SHA-1 of the file contents alone. Every Git object, including each file stored in Git, is hashed together with a header that records the type of the object (a file is a "blob" object) and its size in bytes. For a blob, the header is the word "blob", a space, the size as a decimal number, and a NUL byte.

    You can calculate the Git object ID for a file by running:

    git hash-object tech01.jpg
    

    This will calculate the SHA-1 of the header followed by the contents of the file.
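
To see where the difference comes from, here is a minimal Python sketch that reproduces both values for the same bytes. The function names `git_blob_sha1` and `plain_sha1` are made up for this illustration, and the sketch assumes a repository that uses SHA-1 object IDs (the default) and that Git stores the bytes unchanged, with no line-ending or other filter conversion.

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """Hash the content the way Git hashes a blob: header first, then the bytes."""
    # Blob header: the word "blob", a space, the size in decimal, and a NUL byte.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def plain_sha1(data: bytes) -> str:
    """Hash only the raw bytes, as a generic SHA-1 tool does."""
    return hashlib.sha1(data).hexdigest()


if __name__ == "__main__":
    sample = b"hello world\n"
    print("plain SHA-1:   ", plain_sha1(sample))
    print("Git blob SHA-1:", git_blob_sha1(sample))
```

For a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to both functions. The first value should match what generic SHA-1 tools report, and the second should match the output of `git hash-object` for that file.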