Search code examples
pythonimagecompressionhuffman-code

Is it possible build a huffman algorithm with characters of '0' and '1'?


I have an array with this format:

P1
5 5
0 0 0 0 0
0 1 0 1 0
1 0 1 0 1
1 1 1 1 1
0 0 0 1 1

and I want to compress this array in order using huffman. My question is if it's possible because when I search by huffman algorithm I only find with multiple characters and didn't nothing similar to this.

If so possible, how would I make it?

I've tried this:

def getBitString(file):
   file = open(ficheiro, "rb").read()
   string = ""
   for byte in file:
     print(byte)
     string+=format(byte,'08b')

   return string

I only converted the file to binary. I'm thinking about the problem to progress but I'm stuck I need some help


Solution

  • The Huffman encoding uses the probability of characters from an alphabet inside a text to safe space when encoding. For it to work you will need an to have an alphabet with more than two characters. Two characters would need 1 bit to store and 1 bit when encoded. That would not safe anything.

    With your example data you could see every row as a characters. But there is an other requirement for the Huffman encoding to be helpful: the characters have to repeat, the more the better. There would be only 5 non-repeating characters.

    In your simple case of encoding 25 bits Huffman encoding is not going to be helpful. With a lot more repeating data it might be.