
How to get a binary encoding of a string in Python?


I am trying to build an MD5 cracker for practice. Before I go any further, here is my code:

import binascii
import fileinput
import hashlib

def offline_wordlist_attack(list_path):
    with fileinput.input(files=(list_path)) as wordlist:
        for word in wordlist:
            md5_hash_object = hashlib.md5()  # construct an MD5 hash object
            md5_hash_object.update(binascii.a2b_uu(word))  # the line in question: word is a str
            word_digest = md5_hash_object.digest()  # compute the MD5 digest of the word
            print(word_digest)  # debug

My issue is with md5_hash_object.update(binascii.a2b_uu(word)). The Python 3 hashlib documentation states that the data passed to update() must be a bytes object rather than a str, and it uses m.update(b"Nobody inspects") as an example. In my code, I cannot simply put a b prefix in front of the variable word. So I tried the binascii library, but its documentation has a note stating:

Note

Encoding and decoding functions do not accept Unicode strings. Only bytestring and bytearray objects can be processed.

Could somebody help me out with this? It is getting the better of me.


Solution

  • You need to pass in a bytes object rather than a str. The usual way to go from str (a Unicode string in Python 3) to bytes is to call the .encode() method on the string and specify the encoding you want to use.

    my_bytes = my_string.encode('utf-8')
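
Applied to the function in the question, a minimal sketch might look like the following. It assumes the wordlist is UTF-8 text with one candidate per line. The helper name md5_hex, the use of open() in place of fileinput, and yielding results instead of printing them are choices made for illustration and are not part of the original code. Each line read from a file still carries its trailing newline, so the sketch strips it before hashing; otherwise the digest would be that of the word plus "\n".

```python
import hashlib


def md5_hex(word, encoding="utf-8"):
    """Return the hexadecimal MD5 digest of a text string (hypothetical helper)."""
    # hashlib operates on bytes, so the str is encoded first.
    return hashlib.md5(word.encode(encoding)).hexdigest()


def offline_wordlist_attack(list_path):
    """Yield (word, md5_hex_digest) for every line in the wordlist file."""
    # Assumption: the wordlist is UTF-8 encoded, one word per line.
    with open(list_path, encoding="utf-8") as wordlist:
        for line in wordlist:
            word = line.rstrip("\n")  # drop the newline so it is not hashed
            yield word, md5_hex(word)
```

hexdigest() returns the digest as a hexadecimal string, which is the form MD5 hashes are usually published in. If the raw 16 bytes are needed instead, digest() can be used in its place.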