I've been trying to implement an Huffman encoding algorithm based on numeric digits. I've done the part to construct an Huffman tree. But the recursion algorithm doesn't work as expected. It should return the paths in the tree from the root to the designated node, but it always return the wrong path.
The strange thing is, the code seems to be doing what is correct, and it can find the real path. But the result it returns is always something else.
def get_encoding_for(symbol_p, node, encoding):
global encoded_string
result = encoding
if not isinstance(node, list):
if symbol_p == node:
encoded_string = result
return result
else:
for n in node:
index = str(node.index(n))
# print("Node {} with index {}".format(n, index))
result = get_encoding_for(symbol_p, n, encoding + index)
return result
The tree is structured using simply lists.
The huffman tree looks like: [[3.0, [1.0, 2.0]], [4.0, 5.0]]
This is an example output of a simple tree using elements 1,2,3,4,5.
Loop 1.0. Node 1.0 -> Coding 010:
Loop 2.0. Node 2.0 -> Coding 011:
Loop 3.0. Node 3.0 -> Coding 00:
Loop 4.0. Node 4.0 -> Coding 10:
Loop 5.0. Node 5.0 -> Coding 11:
This is what I want the function to return, but it only returns me "11" in all the five iterations. I had to use a global variable to "intercept" the correct coding and I'm not happy with this... I think the problem is on the returning. I've tried a lot of ways of returning but none of them worked.
Can someone tells me what's wrong about the recursion? Thank you so much!
A different approach (from those in my first answer) is to separate generating paths from the stopping logic.
def paths(bintree):
if isinstance(bintree, list):
for i, p in ((0,'0'), (1,'1')):
for val, path in paths(bintree[i]):
yield (val, p + path)
else:
yield (bintree, '')
def encoding2(i, bintree):
for val, path in paths(hufftree):
if i == val:
return path
return ''
# Test
hufftree = [[3, [1, 2]], [4, 5]]
for v in paths(hufftree): print(v)
for i in range(6):
print(encoding2(i, hufftree))
# Prints
010
011
00
10
11
At this point, one might considered creating a dict mapping values to encodings.
huffdict = dict(paths(hufftree))
for i in range(6):
print(huffdict.get(i, ''))
# Prints same as above