Search code examples
stringalgorithmdata-structuresparallel-processingtrie

Parallel algorithm for constructing a trie?


Because the trie data structure has such a huge branching factor and each subtree is totally independent of the others, it seems like there ought to be a way to enormously speed up trie construction for a given dictionary by adding all the words in parallel.

My initial idea on how to do this was the following: associate a mutex with each pointer in the trie (including the pointer to the root) and then have each thread follow the normal algorithm for inserting a word into the trie. However, before following any pointers, a thread must first acquire the lock on that pointer so that if it needs to add a new child node to the trie, it can do so without introducing any data races.

The catch with this approach is that it uses an enormous number of locks - one for each pointer in the trie - and does an enormous number of acquires and releases - one for each character in each input string.

Is there a way to build a trie in parallel without using nearly as many locks?


Solution

  • An obvious lock-free algorithm would be:

    1. Bucket-sort the input strings by a length-k prefix (ordinarily, k=1, but with small alphabets, increase k).
    2. For each letter, construct a trie containing the k-suffix of all the strings starting with that letter.
    3. Merge the tries from the previous step (when k=1, just add a root node).

    Assuming a uniform distribution of prefixes, this can give you a linear speedup up to the size of the alphabet to the power k.