Search code examples
data-structuresclojurerandom-accesszipper

Easy tree traversal and fast random node access


Edited after Alex Taggart's remark below.

I am using a zipper to easily traverse and edit a tree which can grow to many thousands of nodes. Each node is incomplete when it is first created. Data is going to be added/removed all the time in random positions, leaf nodes are going to be replaced by branches, etc.

The tree can be very unbalanced. Fast random access to a node is also important.

An implementation would be to traverse the tree using a zipper and create a hash table of the nodes indexed by key. Needless to say the above would be very inefficient as:

  • 2 copies of each node need to be created
  • any changes need to be consistently mirrored between the 2 data structures (tree and hashmap).

In short, is there a time/space efficient way to combine the easiness of traversing/updating with a zipper and the fast access of a hash table in clojure?


Solution

  • Clojure's data structures are persistent and use structural sharing. This means that operations like adding, removing or accumulating are not as inefficient as you describe. The memory cost will be minimal since you are not duplicating what's already there.

    By default Clojure's data structures are immutable. The nodes in your tree like structure will thus not update themselves unless you use some sort of reference type (like a Var). I don't know enough about your specific use case to advice on the best way to access nodes. One way to access nodes in a nested structure is the get-in function where you supply the path to the node to return its value.

    Hope this helps solving your problem.