python python-3.x hashlib

How do I SHA-256 hash raw bits in Python?


h = '''0011001010000101011111010000101111111111101000001001000001001010110100010101111001001011000100111100011110001001111011110111011010010100110011001110111001100010111011010010101101010011110100100110101111110001100101011001000110100010000110110001100101110001'''

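import binascii
import hashlib

# Hashes the ASCII characters '0' and '1', not the bits they represent
a = binascii.hexlify(hashlib.sha256(bytes(h,'utf-8')).digest()).decode()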

>>> a
44f6dafa3d7a1720b5ebbf2adc1663df4dab03776eed48d2cda775237a547e59

So I have a string that represents some binary data. After writing the code above, I realised that it outputs the SHA-256 of the string's ASCII characters. I want the SHA-256 of the raw bits the string represents instead, similar to:

$ echo 0011001010000101011111010000101111111111101000001001000001001010110100010101111001001011000100111100011110001001111011110111011010010100110011001110111001100010111011010010101101010011110100100110101111110001100101011001000110100010000110110001100101110001 | shasum -a 256 -0

So the sha256 should be

>>> a
f3f06d74b794b20645460aa0b17d4e7a77eaaea283ee55344adbfcece4a63432

Everything I've tried gives me errors, and I can't find the answer online.

Anyone know how it's done?


Solution

    import binascii
    import hashlib
    s = '''0011001010000101011111010000101111111111101000001001000001001010110100010101111001001011000100111100011110001001111011110111011010010100110011001110111001100010111011010010101101010011110100100110101111110001100101011001000110100010000110110001100101110001'''
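    # Parse the bit string as a base-2 integer, then pack it into big-endian bytes
    h = int(s, 2).to_bytes((len(s) + 7) // 8, byteorder='big')
    # Hash the raw bytes and hex-encode the digest
    a = binascii.hexlify(hashlib.sha256(h).digest()).decode()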
    

    Output:

    'f3f06d74b794b20645460aa0b17d4e7a77eaaea283ee55344adbfcece4a63432'
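
The conversion above can be wrapped in a small helper. This is only a sketch: the name sha256_of_bits is hypothetical, and rejecting inputs whose length is not a multiple of 8 is an assumption made here because hashlib only accepts whole bytes (int.to_bytes would otherwise pad the missing bits with zeros on the left).

```python
import hashlib

def sha256_of_bits(bits: str) -> str:
    """Return the SHA-256 hex digest of the bytes encoded by a string of '0'/'1'.

    Hypothetical helper, not part of hashlib.
    """
    bits = bits.strip()
    if not bits or set(bits) - {'0', '1'}:
        raise ValueError("expected a non-empty string of '0' and '1'")
    if len(bits) % 8:
        # Assumption: only whole bytes are supported in this sketch.
        raise ValueError("bit count must be a multiple of 8")
    # Passing the byte length explicitly keeps leading zero bits.
    raw = int(bits, 2).to_bytes(len(bits) // 8, byteorder='big')
    return hashlib.sha256(raw).hexdigest()

# '01100001' is the single byte 0x61, i.e. b'a'
print(sha256_of_bits('01100001') == hashlib.sha256(b'a').hexdigest())  # True
```

Passing the byte count to to_bytes matters when the bit string starts with zeros: int() drops them, and the explicit length puts them back.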
    

    int.to_bytes returns an array of bytes representing an integer. If byteorder is "big", the most significant byte is at the beginning of the byte array. If byteorder is "little", the most significant byte is at the end of the byte array.
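
As a quick illustration of the byteorder argument, here is a minimal sketch using a made-up 16-bit value (not taken from the question):

```python
# Illustrative value only: 0x0102 == 258
n = int('0000000100000010', 2)

big = n.to_bytes(2, byteorder='big')        # most significant byte first
little = n.to_bytes(2, byteorder='little')  # most significant byte last

print(big.hex())     # 0102
print(little.hex())  # 0201
```

For a bit string written left to right, 'big' is the order that keeps the bytes in the same sequence as the text.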

    hash.hexdigest() is like hash.digest(), except that the digest is returned as a string of double length containing only hexadecimal digits. The binascii.hexlify(...).decode() call above can therefore be replaced with hashlib.sha256(h).hexdigest().
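
The relationship between the two methods can be checked directly. A small sketch, with an arbitrary two-byte input chosen only for illustration:

```python
import binascii
import hashlib

data = b'\x32\x85'  # arbitrary example bytes, an assumption for this demo

hasher = hashlib.sha256(data)
raw = hasher.digest()        # 32 raw bytes
hexed = hasher.hexdigest()   # 64 hexadecimal characters

print(len(raw), len(hexed))                      # 32 64
print(hexed == binascii.hexlify(raw).decode())   # True
```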