As the title stated, I'm writing a function to compute Huffman codes for symbols in a tree, but I feel completely lost.
A branch looks like this:
{:kind :branch, :frequency frequency, :left child0, :right child1}
A leaf looks like this:
{:kind :leaf, :frequency frequency, :value symbol}
And the code itself is structured like this:
{:tree tree, :length length, :bits bits}
I have the main function already (looks like this):
(defn huffman-codes
"Given a Huffman tree, compute the Huffman codes for each symbol in it.
Returns a map mapping each symbol to a sequence of bits (0 or 1)."
[T]
(into {} (for [s (all-symbols T)] [s (find-in-tree s T '())])
)
)
all-symbols
return the set of all symbols in the tree and I am to write a helper function find-in-tree
that finds the bit string of a symbol
EDIT: I've tried this now and it gets me closer to what I want, but still not right (see error message below)
(defn find-in-tree
[s T l]
(if (isleaf? T)
{(:value T) l}
(merge (find-in-tree s (:left T) (concat l '(0)))
(find-in-tree s (:right T) (concat l '(1)))
)
)
)
ERROR -- got' {:c {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)}, :b {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)}, :d {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)}, :a {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)}} ', expected ' {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)} '
It gets all the correct bit strings but assigns the whole map to every value, and I don't know what's wrong.
Assuming that your Huffman tree is valid (meaning we can ignore :frequency
), and that 0
means 'left' and 1
means 'right':
(defn code-map
"Given a Huffman tree, returns a map expressing each symbol's code"
[{:keys [kind left right value]} code]
(if (= kind :leaf)
{value code}
(merge (code-map left (str code "0"))
(code-map right (str code "1")))))
Demo:
;; sample tree
(def root
{:kind :branch
:left {:kind :branch
:left {:kind :leaf
:value "X"}
:right {:kind :leaf
:value "Y"}}
:right {:kind :leaf :value "Z"}})
;; make sure to pass it "" as second arg
(code-map root "")
;;=> {"X" "00", "Y" "01", "Z" "1"}
To clean this up, you could move the ""
arg into an inner helper function, and the recursion could be made TCO-able.