Search code examples
pythonpython-3.xhashlib

How to hash a variable in Python?


This example works fine example:

import hashlib
m = hashlib.md5()
m.update(b"Nobody inspects")
r= m.digest()
print(r)

Now, I want to do the same thing but with a variable: var= "hash me this text, please". How could I do it following the same logic of the example ?


Solution

  • The hash.update() method requires bytes, always.

    Encode unicode text to bytes first; what you encode to is a application decision, but if all you want to do is fingerprint text for then UTF-8 is a great choice:

    m.update(var.encode('utf8')) 
    

    The exception you get when you don't is quite clear however:

    >>> import hashlib
    >>> hashlib.md5().update('foo')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: Unicode-objects must be encoded before hashing
    

    If you are getting the hash of a file, open the file in binary mode instead:

    from functools import partial
    
    hash = hashlib.md5()
    with open(filename, 'rb') as binfile:
        for chunk in iter(binfile, partial(binfile.read, 2048)):
            hash.update(chunk)
    print hash.hexdigest()