
Method for reading bytes into a hexdigest string


I'm reading 16 bytes from a binary buffer named raw:

md5 = list(struct.unpack('16B', raw.read(16)))

This produces the following list:

>>> print(md5)
[25, 94, 158, 89, 108, 25, 125, 20, 138, 164, 84, 137, 250, 82, 150, 202]

I need to build a proper MD5 string that I can then compare with any hexdigest() from hashlib.md5().

Currently I'm doing it this way:

md5 = list(struct.unpack('16B', raw.read(16)))
for i, b in enumerate(md5):
    # '%02x' keeps the leading zero for bytes below 0x10, which hex(b) would drop
    md5[i] = '%02x' % b
md5 = ''.join(md5)

This works, but I cannot help but feel I'm missing something. Is there a more straightforward conversion between the data in the buffer and the final string?

Note: I understand there are other types of digests, but for now I'm only interested in solving the problem for a hexadecimal digest.


Solution

  • Use binascii.hexlify to convert a bytes object (Python 3) or a binary str (Python 2) into a hex string. On Python 3 the result is itself bytes, so call .decode('ascii') on it before comparing against hexdigest(), which returns a str.

    from binascii import hexlify
    hex_string = hexlify(raw.read(16)).decode('ascii')
    
    if md5.hexdigest() == hex_string:
        ...
    

    Likewise, you can compare the raw bytes directly with digest(); the hexdigest() is just a 32-character readable representation of the 16-byte value that is the actual MD5 digest.

    the_bytes = raw.read(16)
    if md5.digest() == the_bytes:
        ...
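
To see both comparisons end to end, here is a minimal self-contained sketch. It assumes Python 3; the payload, the in-memory buffer standing in for raw, and the name expected are made up for illustration and are not from the question. It also shows bytes.hex() (Python 3.5+) as an alternative to hexlify, and why hex() followed by replace('0x', '') is fragile for bytes below 0x10.

```python
import hashlib
import io
from binascii import hexlify

# Hypothetical data, for illustration only.
payload = b"example payload"
expected = hashlib.md5(payload)

# Stand-in for the question's binary buffer: the 16 digest bytes, then other data.
raw = io.BytesIO(expected.digest() + b"rest of the file")

the_bytes = raw.read(16)

# Two equivalent ways to get the 32-character hex string on Python 3.
hex_via_hexlify = hexlify(the_bytes).decode("ascii")
hex_via_method = the_bytes.hex()  # available since Python 3.5

# Either compare hex strings, or skip the conversion and compare raw bytes.
matches_hex = hex_via_hexlify == expected.hexdigest()
matches_raw = the_bytes == expected.digest()

# hex() does not zero-pad, so a byte below 0x10 loses its leading zero.
short = hex(5).replace("0x", "")            # one character
padded = hexlify(b"\x05").decode("ascii")   # two characters

print(matches_hex, matches_raw, repr(short), repr(padded))
```

Comparing the raw bytes against digest() avoids the string conversion entirely, which is probably the simplest choice when the hex form is not needed for display.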