algorithmcompressionhuffman-codeconsistencyhuffman-tree# Inconsistencies when creating Huffman tree

On Wikipedia, the construction of a Huffman tree is described as such:

The simplest construction algorithm uses a priority queue where the node with lowest probability is given highest priority:

Create a leaf node for each symbol and add it to the priority queue.

While there is more than one node in the queue:

Remove the two nodes of highest priority (lowest probability) from the queue

Create a new internal node with these two nodes as children and with probability equal to the sum of the two nodes' probabilities.

Add the new node to the queue.

The remaining node is the root node and the tree is complete.

But their example for deconstruction (assigning `0`

and `1`

) is a bit weird:

It's easy to see that except from the leaf node, the frequencies of intermediate nodes don't seem to matter that much? For instance, `a3 > a4`

and `0`

are added to `a3`

while `1`

are added to `a4`

's string. But, `a1 < a2 + a3 + a4`

and the same thing is done. So, *does the frequencies matter*?

The same thing happens in videos from Leios Lab and Reducible (two well-known math/programming Youtubers). And also these Stack question:

Which node go in left or right on addition of weight while huffman tree creation (stating that you should be

**consistent**when putting nodes left/right based on frequencies)What happens if we are inconsistent while creating Huffman Tree? and Huffman Tree Coding (stating that while being

**inconsistent**does change the code after completion, it doesn't matter, as the height stay the same)

So here is my questions:

Does it matter if children nodes are place left/right randomly?

If it matter (or doesn't matter), is there a standard? For example, lower node right, higher node left, lower gets a

`1`

, higher get a`0`

(or whichever combination of these four actions)? It probably doesn't matter with the**efficiency**if you're consistent already (as all the 3 answers above indicate) but is there a common consensus?

For example, this online generator follows the `lower left, higher right, lower 0, higher 1`

rule, but others can follow another rule.

P.S: Noted that the second question is two-part, as the right/left problem is related to construction, while the `0`

/`1`

relates to when you assign value. Normally you would do both in a same `Huffman`

function anyway but I'm creating an animation on this and the order probably does matter visually.

Solution

No. The code will still be optimal. (The accepted answer to the linked question is wrong!)

No. The standard, to the extent that there is one, is to ignore the resulting tree entirely, and use only the resulting code

*lengths*(in bits) for each symbol, and a well-defined lexographic ordering of the symbols, to generate the actual ones and zeros of the codes. This is called a Canonical Huffman code. There are many possible assignments of ones and zeros to the symbol codes, given that the zero and the one can be arbitrarily assigned to left or right for every branch. Using a canonical procedure establishes an agreement between the encoder and decoder as to a single possible code for any given set of lengths and symbols. This not only sets a standard for the zeros and ones, but importantly, since the purpose is compression, reduces the amount of information that needs to be sent from the encoder to the decoder in order to describe the code.

- How to generate uniformly distributed subintervals of an interval?
- Generating random number in a non-power-of-2 range from random bits
- Algorithm - Search and Replace a string
- Looking for a branchless algorithm for converting a specific set of 4-bit integers
- Different results for XIRR between Excel and ExcelFinancialFunctions 3.2.0
- Find four,whose sum equals to target
- A* algorithm can't find the goal in Python
- Efficiently getting all divisors of a given number
- Are there any existed API to split IEnumerable<T> to many Vector<T> in CSharp？
- Generating all divisors of a number given its prime factorization
- Efficient way to insert a number into a sorted array of numbers?
- BFS Maximize Minimum Distance from a Monster along a path
- Is there effective algorithm that will return all different combination?
- Big O of algorithm that steps over array recursively
- Modification of Dijkstra's algorithm to make it work with negative weights and its time complexity
- Traversing/Moving over an unfolded cube
- Fibonacci Recursion using Golden Ratio(Golden Number)
- Karatsuba implementation in C
- Why is O(n) better than O( nlog(n) )?
- What is Sliding Window Algorithm? Examples?
- How to write a function to navigate through a non-binary tree?
- Evenly distribute QDate values into certain number of slots
- Find max product using divide and conqure in O(n) time
- Cache-friendly sqare matrix transposition logic issue
- Why doesn't Dijkstra's algorithm work for negative weight edges?
- Fast Convertion From Adjacency List to Edge List
- I convert ASCII words into numbers but am stuck trying to decode them. How to convert 1=a, 2=b, 28=ab etc? (psudeocode okay)
- Reversing the word order in a string in place
- Trying to make a 2x2 rubik's cube solving algorithm , how do i find the solution path (DFS)?
- question about missing element in array