Search code examples
javascriptfunctional-programmingtreetrie

How to construct an ordered Map as a persistent data structure?


I have a persistent data structure hamt based on a hashed array mapped trie, which is the basis for a couple of more specific persistent data structures, like an immutable array for instance. It provides a rather plain API:

const hamtDel = (hamt, props, k) => {/* implementation */}
const hamtGet = (hamt, k) => {/* implementation */}
const hamtSet = (hamt, props, k, v) => {/* implementation */}
const hamtEmpty = () => {/* implementation */} // creates an empty hamt

hamt, k, v are self-explanatory. props is just a means to add arbitrary properties to the freshly generated hamt object. The immutable array, for instance, has additional length and offset properties to allow efficient cons and snoc operations.

hamt itself is basically an unordered Map. Since ordered Maps are common in Javascript I tried to implement one based on hamt. However, this turns out to be quite difficult. In order to keep track of the insertion order I need a hamt A for the actual key/value pairs as well as one that holds the mapping from number-of-insertion to the corresponding key B.

Given both structures I can access A's elements as usual and traverse A by retrieving the insertions order hold by B. However, when I want to delete an element in A I also need to delete it in B. The keys in B are the number-of-insertions. That means in the worst case I would have to traverse the entire B structure to find the corresponding key.

A third hamt with B's key/value pairs inverted could mitigate the issue, but ending up with three hamts just to obtain an ordered Map seems a poor design choice.

I am pretty sure this problem is well-known and there are solid solutions to solve it. I haven't found anything helpful yet since I lack the proper terminology. Help on this matter is very welcome.


Solution

  • There are a few implementations to take a look at in Scala and Kotlin.

    First up is scala's VectorMap (https://github.com/scala/scala/blob/v2.13.3/src/library/scala/collection/immutable/VectorMap.scala)

    It uses a combination of a Compressed Hash Array-Mapped Trie, also known as CHAMP (an immutable hashmap similar to a HAMT), together with a Vector which is a Radix Balanced Tree to store the keys in order.

    Next there's scala's TreeSeqMap (https://github.com/scala/scala/blob/v2.13.3/src/library/scala/collection/immutable/TreeSeqMap.scala)

    This uses the same CHAMP hashmap as VectorMap, but together with a specialized Trie-Map where keys are like integer indexes, and values are the actual keys of the ordered map.

    Next there's this proposed LinkedHashMap that I wrote a few months ago which is very similar to a mutable double-linked-list backed LinkedHashMap, but splits the linked lists into segments to allow for better persistent updates: https://github.com/scala/scala/pull/8644/files

    Finally there's Kotlin's OrderedMap here: https://github.com/Kotlin/kotlinx.collections.immutable/blob/master/core/commonMain/src/implementations/persistentOrderedMap/PersistentOrderedMap.kt Which is basically the same thing as the previous implementation except the Link segments are all of length 1 (i.e., each key-value is its own segment or in other words there aren't segments but just individual elements which point forward and back by key instead of by pointer).