Search code examples
clojurerecordtree-traversal

getting a flat record data structure out of a nested list in Clojure


Let's say I have a list of lists representing a tree-structure in clojure, like

'(a (b (c d)) (e (f)))

and I want to translate it into a record format like this (for the purpose of passing it off to a visualization package):

[{:id "0" :label "a" :parent nil}
 {:id "1" :label "b" :parent "0"}
 {:id "2" :label "c" :parent "1"}
 {:id "3" :label "d" :parent "1"}
 {:id "4" :label "e" :parent "0"}
 {:id "5" :label "f" :parent "4"}]

What's the right way to go about this? I am pretty shaky with this, but I would think of starting with defrecord, and then some way of looping through the tree, but I don't know how to get started.

(def tree '(a (b (c d)) (e (f))))
(defn list-to-record [l]
  (defrecord rec [id name parent])
  (let [counter (atom 0)]
  (into [] (map ->rec 
                      ... ... ...))))

(list-to-record tree)

Perhaps I should be using clojure.walk?


Edit: to clarify, this should work regardless of what the labels are, so changing the labels in the input list shouldn't do anything to the resulting structure (the :parent values for each :id). That is, the following list, just as above, but with the labels all the same as each other

'(a (a (a a)) (a (a)))

should get translated into

[{:id "0" :label "a" :parent nil}
 {:id "1" :label "a" :parent "0"}
 {:id "2" :label "a" :parent "1"}
 {:id "3" :label "a" :parent "1"}
 {:id "4" :label "a" :parent "0"}
 {:id "5" :label "a" :parent "4"}]

Solution

  • Here's a way to do it with Clojure zippers and loop + recur:

    (defn index-zipper [z]
      (loop [loc z, next-id 0, parent-ids [], acc []]
        (cond
          (z/end? loc) acc
    
          (and (z/node loc) (not (z/branch? loc)))
          (recur
            (z/next loc)
            (inc next-id)
            (cond
              (some-> (z/right loc) z/branch?) (conj parent-ids next-id)
              (not (z/right loc)) (some-> parent-ids not-empty pop)
              :else parent-ids)
            (conj acc
                  {:id     (str next-id)
                   :label  (str (z/node loc))
                   :parent (when (seq parent-ids)
                             (str (peek parent-ids)))}))
    
          :else
          (recur (z/next loc) next-id parent-ids acc))))
    

    The loop has bindings for:

    1. Current zipper location
    2. Next :id value, to be incremented each time we see a leaf node
    3. Stack (vector) of current parent :ids, which will be pushed/popped as the zipper descends/ascends. The :parent value for each leaf node is on the top of the parent-ids stack.
    4. accumulator vector for leaf node maps

    You can call the function with a zipper:

    (index-zipper (z/seq-zip '(a (b (c d)) (e (f)))))
    =>
    [{:id "0", :label "a", :parent nil}
     {:id "1", :label "b", :parent "0"}
     {:id "2", :label "c", :parent "1"}
     {:id "3", :label "d", :parent "1"}
     {:id "4", :label "e", :parent "0"}
     {:id "5", :label "f", :parent "4"}]
    
    (index-zipper (z/seq-zip '(a (a (a a)) (a (a)))))
    =>
    [{:id "0", :label "a", :parent nil}
     {:id "1", :label "a", :parent "0"}
     {:id "2", :label "a", :parent "1"}
     {:id "3", :label "a", :parent "1"}
     {:id "4", :label "a", :parent "0"}
     {:id "5", :label "a", :parent "4"}]