Search code examples
mathclojurelisp

Given a Huffman tree, how to compute Huffman code for each symbol?


As the title stated, I'm writing a function to compute Huffman codes for symbols in a tree, but I feel completely lost.

A branch looks like this:

{:kind :branch, :frequency frequency, :left child0, :right child1}

A leaf looks like this:

{:kind :leaf, :frequency frequency, :value symbol}

And the code itself is structured like this:

{:tree tree, :length length, :bits bits}

I have the main function already (looks like this):

 (defn huffman-codes
    "Given a Huffman tree, compute the Huffman codes for each symbol in it. 
    Returns a map mapping each symbol to a sequence of bits (0 or 1)." 
   [T]

   (into {} (for [s (all-symbols T)] [s (find-in-tree s T '())])

 ) 
 )

all-symbols return the set of all symbols in the tree and I am to write a helper function find-in-tree that finds the bit string of a symbol

EDIT: I've tried this now and it gets me closer to what I want, but still not right (see error message below)

    (defn find-in-tree 
       [s T l]
    
       (if (isleaf? T)
           {(:value T) l}
           (merge (find-in-tree s (:left T) (concat l '(0)))
              (find-in-tree s (:right T) (concat l '(1)))
        )
       )
    
    )

ERROR -- got' {:c {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)}, :b {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)}, :d {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)}, :a {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)}} ', expected ' {:d (0 0 0), :c (0 0 1), :b (0 1), :a (1)} '

It gets all the correct bit strings but assigns the whole map to every value, and I don't know what's wrong.


Solution

  • Assuming that your Huffman tree is valid (meaning we can ignore :frequency), and that 0 means 'left' and 1 means 'right':

    (defn code-map
      "Given a Huffman tree, returns a map expressing each symbol's code"
      [{:keys [kind left right value]} code]
      (if (= kind :leaf)
        {value code}
        (merge (code-map left (str code "0"))
               (code-map right (str code "1")))))
    

    Demo:

    ;; sample tree
    (def root 
      {:kind :branch 
       :left {:kind :branch
              :left {:kind :leaf
                     :value "X"}
              :right {:kind :leaf
                      :value "Y"}}
       :right {:kind :leaf :value "Z"}})
    
    ;; make sure to pass it "" as second arg
    (code-map root "")
    ;;=> {"X" "00", "Y" "01", "Z" "1"}
    

    To clean this up, you could move the "" arg into an inner helper function, and the recursion could be made TCO-able.