I use the following code to compute the binary representation of an MD5 hash. An MD5 digest is always 128 bits, and bin returns a string starting with "0b". Therefore, the length of md5_bin should always be 130, but when I run the program, it varies between 128 and 130 for different values of random_str.
md5_bin = bin(int(hashlib.md5(random_str).hexdigest(), 16))
print len(md5_bin)
Sure, MD5 is always 128 bits, but sometimes the first bit is a 0, and occasionally the second bit is too.
Think of it this way: the decimal string '15' and the decimal string '0015' both represent the same number, 15. When you ask Python to convert the int 15 to a string, you're going to get '15', not '0015'. It has no way of knowing that you wanted 4 digits instead of 2:
>>> n = int('0015')
>>> str(n)
'15'
And it's the same with bin. It has no way of knowing that you wanted 128 bits instead of 126. You gave it a number with 126 bits, so it gives you 126 binary digits.
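A quick way to see this, as a small sketch using hand-picked 8-bit integers rather than real MD5 output (the names full and short are only illustrative): bin emits exactly as many digits as the number needs, so each leading zero bit shortens the result by one character.

```python
# Hypothetical 8-bit values, chosen only to illustrate leading zero bits.
full = 0b10110101       # top bit is set
short = 0b00110101      # top two bits are clear

full_bin = bin(full)    # '0b10110101'
short_bin = bin(short)  # '0b110101'

print(len(full_bin))    # 10: 8 digits plus the '0b' prefix
print(len(short_bin))   # 8: 6 digits plus the '0b' prefix

# int.bit_length() reports how many digits bin() will produce.
print(short.bit_length())  # 6
```

The same thing happens at 128 bits: a digest whose top two bits are 0 comes back from bin with 126 digits, for a total length of 128.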
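But you can tell it you want that, e.g., with a format spec. Note that format needs the integer itself, not the string returned by bin:
md5_int = int(hashlib.md5(random_str).hexdigest(), 16)
bits = format(md5_int, '0128b')
… or, equivalently:
bits = '{:0128b}'.format(md5_int)
If you want the 0b prefix, you can add that. With #, the prefix counts toward the field width, so the width becomes 130:
bits = format(md5_int, '#0130b')
bits = '{md5_int:#0130b}'.format(md5_int=md5_int)
bits = '0b{md5_int:0128b}'.format(md5_int=md5_int)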
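Putting it together, here is a minimal end-to-end sketch. The input b"hello world" is only a placeholder for random_str, and md5_int is an illustrative name for the integer value of the digest. The bytes literal and print() are assumptions made so that it runs on Python 3 as well as Python 2.7.

```python
import hashlib

random_str = b"hello world"  # placeholder input, not from the question

digest_hex = hashlib.md5(random_str).hexdigest()  # 32 hex characters
md5_int = int(digest_hex, 16)                     # the digest as an integer

unpadded = bin(md5_int)               # leading zero bits are dropped
padded = format(md5_int, '0128b')     # always 128 binary digits
prefixed = format(md5_int, '#0130b')  # '0b' plus 128 digits; the width includes the prefix

print(len(unpadded) <= 130)  # True
print(len(padded))           # 128
print(len(prefixed))         # 130
```

Formatting the integer directly, rather than padding the string that bin returns, avoids having to strip and re-add the prefix.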