Tags: python, python-3.x, hash, md5, hashlib

Compare MD5 hashes of two files in Python


I want to compare the hashes of two files. But whether the files are different or not, and even when the printed hashes differ, the comparison evaluates to True.

Here is the code:

import hashlib

hasher1 = hashlib.md5()
afile1 = open('canvas.png', 'rb')
buf1 = afile1.read()
a = hasher1.update(buf1)
print(str(hasher1.hexdigest()))

hasher2 = hashlib.md5()
afile2 = open('img5.png', 'rb')
buf2 = afile2.read()
b = hasher2.update(buf2)
print(str(hasher2.hexdigest()))

print(str(a) == str(b))

The output:

614c9853a7f62c5b60d7d15bde80708f
76dc116b2c1b19b265db5e657846e649
True

Process finished with exit code 0

Solution

  • As a general rule, Python methods follow the principle of command-query separation: methods that modify the object in place (i.e. commands) return None. This includes, for example, list.sort and dict.update. It is also true of the hasher1.update method. So

    a = hasher1.update(buf1)
    

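    assigns None to a. The same happens for b, so str(a) == str(b) compares 'None' with 'None' and is always True, regardless of the file contents. Instead, use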

    hasher1.update(buf1)
    a = hasher1.hexdigest()
    

    and similarly for b.


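    Putting it together, with a loop so the hashing code is not duplicated:

    import hashlib

    digests = []
    for filename in ['canvas.png', 'img5.png']:
        hasher = hashlib.md5()
        # The with block closes the file even if reading fails.
        with open(filename, 'rb') as f:
            hasher.update(f.read())  # update() returns None; do not assign it
        digest = hasher.hexdigest()  # the hex string to compare
        digests.append(digest)
        print(digest)

    # True only if both files have the same MD5 digest.
    print(digests[0] == digests[1])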
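
The loop above reads each file into memory in one go, which is fine for small images. For larger files, a variant that feeds the hasher in chunks may be preferable. The following is only a sketch: the helper names file_md5 and same_md5, the 64 KiB chunk size, and the temporary file names in the demo are assumptions made for illustration and are not part of the original question or answer.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path` (hypothetical helper)."""
    hasher = hashlib.md5()
    with open(path, 'rb') as f:
        # Read fixed-size chunks until read() returns b'' at end of file,
        # so the whole file is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            hasher.update(chunk)  # update() returns None; ignore the result
    return hasher.hexdigest()


def same_md5(path_a, path_b):
    """Return True if both files have the same MD5 digest (hypothetical helper)."""
    return file_md5(path_a) == file_md5(path_b)


if __name__ == '__main__':
    # Demo with throwaway files; the names and contents are made up.
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, 'first.bin')
        second = os.path.join(tmp, 'second.bin')
        copy = os.path.join(tmp, 'copy.bin')
        for path, data in [(first, b'hello'), (second, b'world'), (copy, b'hello')]:
            with open(path, 'wb') as f:
                f.write(data)
        print(same_md5(first, second))  # files with different contents
        print(same_md5(first, copy))    # files with identical contents
```

With the real files from the question, the call would be same_md5('canvas.png', 'img5.png'). Note that the comparison is between the strings returned by hexdigest(), never between the return values of update().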