Tags: graph, compression, symbols, entropy, information-theory

What is the information entropy of a node in a symbolic graph?


I'm attempting to develop a better intuition for information entropy. The example I would like to look at is a symbolic graph where the leaves are more specific entity types than the root.

If each node in the graph can be recorded using the same number of bits (e.g., a 64-bit ID handle), would the more specific 'Jack Pine' node be considered higher or lower entropy than the 'Tree' node?

[Image: Symbolic Graph]

Update, 12 hours after posting: Perhaps this concept can't be captured with entropy alone. Assume the symbol 'Jack Pine' carries more information than the symbol 'Tree', yet can be transmitted using the same number of bits. In that case the observer receiving the symbol is decompressing the information in their mind based on prior knowledge, so receiving the symbol for 'Jack Pine' gives them more information than receiving 'Tree', because they understand it refers to a specific type of tree. Does this mean 'Jack Pine' is lower entropy, because it carries more information for a given signal and is therefore more highly compressed?
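One way to see the distinction raised in the update is to separate the fixed code length from the Shannon self-information of a symbol. The sketch below uses made-up prior probabilities, purely for illustration, and computes -log2 p(x) for each symbol: the rarer, more specific 'jack pine' conveys more bits than 'tree', even though both are sent as the same 64-bit handle.

```python
import math

# Made-up prior probabilities the receiver assigns to each symbol, purely
# for illustration (the remaining probability mass belongs to other
# symbols not listed). Broad categories are assumed more likely to be
# named than a specific species.
prior = {
    "tree": 0.5,
    "pine": 0.25,
    "jack pine": 0.125,
}

# Shannon self-information I(x) = -log2 p(x), in bits. The wire encoding
# is a fixed 64-bit handle either way; the information conveyed depends
# only on how unlikely (how specific) the symbol is to the receiver.
for symbol, p in prior.items():
    print(f"{symbol:>9s}: sent as 64 bits, conveys {-math.log2(p):.1f} bits")
```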


Solution

  • If we assume that each branch has equal probability, then going from Tree to a kind of tree adds one bit of entropy, and going from a pine tree to a kind of pine tree adds another bit. Knowing that it is a Jack Pine therefore carries two more bits of entropy than just knowing that it is a Tree, one bit per binary branching level (see the sketch below).
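To make that counting concrete, here is a minimal Python sketch under the same assumption of equally likely binary branches. The taxonomy is invented for illustration and need not match the graph in the question; the remaining entropy at a node is log2 of the number of leaves beneath it.

```python
import math

# Invented taxonomy with two equally likely branches at each split --
# the same assumption the answer makes. The real graph may differ.
taxonomy = {
    "tree": ["pine", "oak"],
    "pine": ["jack pine", "scots pine"],
    "oak":  ["red oak", "white oak"],
}

def leaves_under(node):
    """Count the leaf species reachable from `node`."""
    children = taxonomy.get(node)
    if not children:
        return 1
    return sum(leaves_under(child) for child in children)

def remaining_entropy_bits(node):
    """Entropy (in bits) of 'which species?' once you know you are at
    `node`, assuming all leaves under it are equally likely."""
    return math.log2(leaves_under(node))

print(remaining_entropy_bits("tree"))       # 2.0 -- two binary choices left
print(remaining_entropy_bits("pine"))       # 1.0 -- one choice left
print(remaining_entropy_bits("jack pine"))  # 0.0 -- fully specified

# Learning 'Jack Pine' instead of just 'Tree' therefore resolves
# 2.0 - 0.0 = 2 bits, one per branching level.
```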