Difference in SHA Hash between git hash-object & git hash-object -t


I want to calculate a Git SHA hash without using git hash-object; that is, I want to use shasum to calculate it.

I know that for the following case

    body="tree 491e9405120eaaa57cce3938fd38508d85a5e08b 
parent 8550f0f1a7e32fb1bb0933b0294c116a8dbe6dce 
author user <me@example.com> 1390718030 +0000 
committer user <me@example.com> 1390718030 +0000 
This is a test"

echo "$body" | git hash-object -w --stdin #755481b921f13bcfd731d74287a0c5847519ee81

l=$((${#body} + 1))
echo -e "blob $l\0$body" | shasum #755481b921f13bcfd731d74287a0c5847519ee81

the hashes are the same. But if I use the -t commit option with hash-object, I get a different hash. How can I calculate the commit hash using shasum?

git hash-object -t commit --stdin <<< "$body" #b4c45adbbe35d3d3c73de48d039a8e3038f5ec54

Solution

  • You changed the type of the object being hashed.
    From the git hash-object documentation:

    -t <type>
    
        Specify the type (default: "blob").
    

    You went from the default blob to commit.

    The object that is actually written starts with its type, and that type is part of the data the SHA-1 is computed over.
    See:

    Git calculates the SHA1 for a file (or, in Git terms, a "blob"):

    sha1("blob " + filesize + "\0" + data)
    

    The type prefix is therefore part of the content that goes into the SHA-1.
    With -t commit, the prefix is no longer 'blob' but 'commit', and since the hashed content is different, the SHA-1 is different too. To reproduce the commit hash with shasum, use commit instead of blob in the header.
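The header rule above can be sketched in a few lines of Python. This is only an illustration: the function name `git_object_hash` and the sample bytes are made up for the example, and it assumes a repository that uses SHA-1 object names.

```python
import hashlib


def git_object_hash(data: bytes, obj_type: str = "blob") -> str:
    """Return the SHA-1 Git would assign to `data` stored as `obj_type`."""
    # The header is "<type> <size in bytes>\0", followed by the raw content.
    header = f"{obj_type} {len(data)}\0".encode("ascii")
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    # Placeholder bytes, not the commit body from the question.
    body = b"example content\n"
    print("blob  :", git_object_hash(body, "blob"))
    print("commit:", git_object_hash(body, "commit"))
```

The two printed values differ even though the bytes are identical, because only the type word in the header changes. Feeding in the exact bytes that git hash-object reads from stdin (including the trailing newline) with "commit" as the type should give the same value as git hash-object -t commit --stdin.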

    You can inspect a stored object with:

    python3 -c "import zlib,sys; print(repr(zlib.decompress(sys.stdin.buffer.read())))" < .git/objects/02/b365d4af3ef6f74b0b1f18c41507c82b3ee571
    

    The first word of the output is the object type.
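As a slightly longer sketch of the same inspection, the decompressed bytes can be split into type, size, and content. The helper name `parse_loose_object` is hypothetical, and the sample object is built in memory rather than read from a real .git/objects file.

```python
import zlib


def parse_loose_object(raw: bytes):
    """Decompress a loose object and split it into (type, size, content)."""
    data = zlib.decompress(raw)
    # Everything before the first NUL byte is the "<type> <size>" header.
    header, _, content = data.partition(b"\0")
    obj_type, _, size = header.partition(b" ")
    return obj_type.decode("ascii"), int(size), content


if __name__ == "__main__":
    # Stand-in for the bytes of a file under .git/objects/ (assumption:
    # a zlib-compressed loose object, not one stored in a packfile).
    sample = zlib.compress(b"blob 12\0hello world\n")
    print(parse_loose_object(sample))
```

To try it on a real object, read the file in binary mode and pass its bytes to the function; the first element of the result is the type that went into the hash.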

    For further reference, see How Git Works.