python binary binaryfiles huffman-code

How to convert 0's and 1's to binary and back for a Huffman algorithm?


I am writing a Huffman coding program, but I have a problem with the binary conversion step.

The rest of the program already works: it can build a tree from the symbols and produce a string of 0's and 1's that represents them. Now I want to convert this string to a binary format and back again. Currently, I use this code to convert the string to bytes:

def toBytes(data):
    b = bytearray()

    # Take the bit string 8 characters at a time and parse each chunk as base 2.
    for i in range(0, len(data), 8):
        b.append(int(data[i:i+8], 2))

    return bytes(b)

I can convert the string to a binary format but can't convert it back. For example, when I pass "01111101011000" to the function, it returns b'}\x18'. How can I convert this binary format back to my 0's and 1's?


Solution

  • You can write a function that turns bytes back into a string of 0's and 1's by making use of two observations:

    • str.format's b type specifier turns an integer into the equivalent string of ones and zeroes, and {:08b} zero-pads the result to eight digits.
    • Iterating over a bytes object yields integers from 0 to 255, so it can be treated much like a list of integers.

     

    >>> def to_bin(b):
    ...     return "".join("{:08b}".format(x) for x in b)
    ...
    >>> b = b'}\x18'
    >>> print(to_bin(b))
    0111110100011000
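Note that the result has 16 digits while the original string had 14: the last chunk, "011000", was only six characters long, so it comes back zero-padded to "00011000". If the exact bit string has to survive the round trip, the amount of padding needs to be stored somewhere. Below is a minimal sketch of one way to do that, assuming it is acceptable to pad the string on the right and to spend one leading byte on the pad count. The names to_bytes and to_bits and the one-byte header layout are hypothetical and not part of the original code.

```python
def to_bytes(bits):
    """Pack a string of '0'/'1' characters into bytes, recording the padding."""
    pad = (8 - len(bits) % 8) % 8      # zero bits needed to fill the last byte
    padded = bits + "0" * pad          # pad on the right so every chunk is 8 long
    body = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return bytes([pad]) + body         # assumed layout: first byte is the pad count


def to_bits(data):
    """Inverse of to_bytes: read the header, expand each byte, strip the padding."""
    pad, body = data[0], data[1:]
    bits = "".join("{:08b}".format(x) for x in body)
    return bits[:len(bits) - pad]


original = "01111101011000"
assert to_bits(to_bytes(original)) == original
```

Padding on the right keeps every real bit in its original position, so the decoder only has to drop the last pad characters. Any other way of recording the original length, such as storing the bit count in a file header, would work just as well.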