
Is there a way to generate a SHA-1 hash of a sequence of 1s and 0s in Python?


I want to generate a random sequence of 1s and 0s and input it into the SHA1 hash calculator in Python.

The hashlib library accepts bytes-like objects as input to its update() function. I have tried using random.getrandbits(64) to generate a random sequence, but after I convert it into bytes using .to_bytes() and call .decode() on the result, I get an error saying that the 'utf-8' codec can't decode it.

Code:

x = random.getrandbits(64)
print(x)
print(format(x, 'b'))

binary_int = int(format(x, 'b'), 2)
  
# Getting the byte number
byte_number = (binary_int.bit_length() + 7) // 8
  
# Getting an array of bytes
binary_array = binary_int.to_bytes(byte_number, "big")
  
# Converting the array into ASCII text
ascii_text = binary_array.decode()
  
# Getting the ASCII value
print(ascii_text)

Error:

17659976144931976749
1111010100010100110101101011110010111100100010101111011000101101
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
/var/folders/9s/msn7k8q55yn6t6br55830hc40000gn/T/ipykernel_33103/157314006.py in <module>
     12 
     13 # Converting the array into ASCII text
---> 14 ascii_text = binary_array.decode()
     15 
     16 # Getting the ASCII value

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf5 in position 0: invalid start byte

I realize the error means that the generated random bit sequence is not valid UTF-8/ASCII, but how do I work around that to create valid inputs for the SHA1 function?

I have also tried the suggestion mentioned here to use the "ISO-8859-1" encoding:

binary_int = random.getrandbits(64)

# Getting the byte number
byte_number = (binary_int.bit_length() + 7) // 8
  
# Getting an array of bytes
binary_array = binary_int.to_bytes(byte_number, "big")
  
# Converting the array into ASCII text
text = binary_array.decode(encoding='ISO-8859-1')
  
print(text)

print(type(text))

print(len(text))

import sys
print(sys.getsizeof(text.encode('ISO-8859-1')))

print(hash_sha1(text.encode('ISO-8859-1')))

Output:

¦—u¦9}5É
<class 'str'>
8
41
bc25cb6cb34c2b7c73bbba610e0388386c2e70b2

But sys.getsizeof() prints 41 bytes for text.encode('ISO-8859-1'), not the 8 bytes (64 bits) I expected.

The code above uses 64-bit data for testing purposes. Ultimately, I just want to ensure that I am feeding constant-size, randomly generated 512-bit data into the SHA1 function. Is there a way to do that? Thanks.

Edit: made it work, thanks to the answer by Drakax.

Final code:

import os, hashlib
k = os.urandom(64)
# print the random bytes
print(k)

# print them in bit format (64 bytes = 512 bits)
for byte in k:
    print(f'{byte:0>8b}', end='')
print()

# print the sha1 hash 
print(hashlib.sha1(k).hexdigest())

Solution

  • Have you tried one of these:

    1. UUID

    import uuid
    uuid.uuid4().hex
    

    Doc: http://docs.python.org/2/library/uuid.html

    1.1

    import uuid
    from hashlib import md5

    # Python 3: md5 lives in hashlib and needs bytes, so encode the UUID string
    print(md5(str(uuid.uuid4()).encode()).hexdigest())
    

    2. Secrets (Python 3.6+)

    >>> import secrets
    >>> secrets.token_hex(nbytes=16)
    '17adbcf543e851aa9216acc9d7206b96'

    >>> secrets.token_urlsafe(16)
    'X7NYIolv893DXLunTzeTIQ'

    >>> secrets.token_bytes(128 // 8)
    b'\x0b\xdcA\xc0.\x0e\x87\x9b`\x93\\Ev\x1a|u'
    

    Doc: https://docs.python.org/3/library/secrets.html

    3. binascii (python 2.x and 3.x)

    import os
    import binascii
    # hexlify() returns bytes in Python 3; decode() turns them into a plain str
    print(binascii.hexlify(os.urandom(16)).decode())
    # e.g. 4a4d443679ed46f7514ad6dbe3733c3d
    

    Doc: https://docs.python.org/3/library/binascii.html

    4. hashlib

    import os, hashlib
    hashlib.md5(os.urandom(32)).hexdigest()
    

    Doc: https://docs.python.org/3/library/hashlib.html

    Should be enough for now ;)
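
To tie the options above back to the original goal (a constant-size 512-bit random input for SHA-1), here is a minimal sketch. The helper names (sha1_hex, to_bit_string, random_block_from_int) and the N_BYTES constant are made up for illustration and are not part of any library. The main point is that hashlib works on raw bytes, so there is no need to decode them into text first.

```python
import hashlib
import random
import secrets
import sys

N_BYTES = 64  # assumption for this sketch: 64 bytes * 8 = 512 bits


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of raw bytes as a hex string."""
    return hashlib.sha1(data).hexdigest()


def to_bit_string(data: bytes) -> str:
    """Render bytes as a string of 1s and 0s, keeping leading zeros."""
    return "".join(f"{byte:08b}" for byte in data)


def random_block_from_int(n_bytes: int = N_BYTES) -> bytes:
    """Build a fixed-size block from random.getrandbits().

    Passing n_bytes to to_bytes() (rather than a length derived from
    bit_length()) keeps the size constant even when the leading bits are 0.
    The random module is not intended for cryptographic use.
    """
    value = random.getrandbits(n_bytes * 8)
    return value.to_bytes(n_bytes, "big")


block = secrets.token_bytes(N_BYTES)  # randomness from the OS
other = random_block_from_int()       # same size, built from an int

print(to_bit_string(block))
print(sha1_hex(block))

# len() reports the payload size in bytes; sys.getsizeof() also counts the
# bytes object's own bookkeeping overhead, so it reports a larger number.
print(len(block) * 8, "bits of data")
print(sys.getsizeof(block), "bytes including object overhead")
```

This would also explain the sys.getsizeof() result in the question: it measures the whole Python object, while len() gives the actual number of data bytes.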