algorithm, data-structures, tree, binary-search-tree

Splay tree real life applications


Where would you use a splay tree in production? I mean a REAL-LIFE example.

I was thinking about implementing autocomplete using tries and splay trees. For a large dataset, it's not a good idea to traverse the trie from node x down to the leaves to collect results, so the idea was to keep a splay tree inside each trie node. When the user enters 'sta', the lookup walks s-t-a to the 'a' node and returns the top 5 elements of that node's splay tree (by BFS/level-order traversal, which doesn't necessarily mutate the tree).

Of course, after the user picks one of the autocomplete suggestions, we would have to traverse up the trie and update the splay trees inside all of those nodes.
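To make the update step concrete, here is a minimal sketch of the idea in Python. It is only an illustration: a `collections.Counter` stands in for the per-node splay tree (a swapped-in structure, not a splay tree), and the names `TrieNode`, `Autocomplete`, `suggest` and `pick` are made up for this example. It walks the prefix nodes from the root down rather than from the leaf up, which touches the same nodes.

```python
from collections import Counter


class TrieNode:
    def __init__(self):
        self.children = {}
        self.ranking = Counter()  # stand-in for the per-node splay tree


class Autocomplete:
    def __init__(self, words):
        self.root = TrieNode()
        for word in words:
            self._bump(word, 0)  # register the word with a pick count of 0

    def _bump(self, word, amount):
        # Visit every prefix node of `word` and adjust the word's count there.
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
            node.ranking[word] += amount

    def suggest(self, prefix, k=5):
        # Read-only: walk to the prefix node and return its k most-picked words.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        return [word for word, _ in node.ranking.most_common(k)]

    def pick(self, word):
        # The user chose `word`: update every node along its path.
        self._bump(word, 1)


ac = Autocomplete(["star", "start", "state", "static", "sun"])
ac.pick("state")
ac.pick("state")
ac.pick("start")
print(ac.suggest("sta", k=2))
```

Note that every word is referenced from each of its prefix nodes, so this trades memory for not having to walk the subtree on every keystroke.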

Since splay trees are sensitive to concurrent access, I was questioning their use in production.

Your ideas?


Solution

  • Splay trees are not a good match for data which rarely or never changes, particularly in a threaded environment. The extra mutations during read operations defeat memory caches and can create unnecessary lock contention. In any case, for read-only data structures, you can do a one-time computation of an optimal tree. Even if that computation is slow, it will have no impact on the long-term execution time.
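    As a rough sketch of what such a one-time computation could look like, the Python below uses the textbook dynamic programme for a statically optimal binary search tree. It assumes you have an access count per key (the `freqs` argument, which is hypothetical here), and the function name `build_optimal_bst` is made up for this example. It is cubic in the number of keys, which only matters once.

```python
from functools import lru_cache
from itertools import accumulate


def build_optimal_bst(keys, freqs):
    """Return (cost, tree) for sorted `keys` with access counts `freqs`.

    `cost` is the sum of freq * depth over all keys (root depth = 1).
    `tree` is a nested tuple (key, left, right), or None when empty.
    """
    prefix = [0, *accumulate(freqs)]  # prefix[j] - prefix[i] == sum(freqs[i:j])

    @lru_cache(maxsize=None)
    def best(i, j):
        # Cheapest tree over keys[i:j].
        if i >= j:
            return 0, None
        # Hanging this range under a root pushes every key one level down,
        # which adds the total weight of the range to the cost.
        weight = prefix[j] - prefix[i]
        options = []
        for r in range(i, j):  # try each key as the root
            left_cost, left = best(i, r)
            right_cost, right = best(r + 1, j)
            options.append((left_cost + right_cost + weight, r, left, right))
        cost, r, left, right = min(options, key=lambda option: option[0])
        return cost, (keys[r], left, right)

    return best(0, len(keys))


cost, tree = build_optimal_bst([10, 12, 20], [34, 8, 50])
print(cost)  # 142
print(tree)  # (20, (10, None, (12, None, None)), None)
```

    The resulting tree is never modified by lookups, so readers need no locks at all.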

    I'm not entirely persuaded by the claim that large tries are slow, and certainly not in the case of autocompleters. Even on not-so-modern hardware, the cost of a trie traversal is trivial compared to the time it takes for the user to type a character, or even the time it takes for the underlying keyboard driver and input processor to deliver the keypress to your application.

    If you really need to optimise a trie, there is good reason to believe that a hybrid data structure works well: a trie at the root, combined with a linear (or binary) search once the remaining alternatives fit in a cache line. This maximizes the benefit of the trie's large fan-out while avoiding the poor caching behaviour and excessive storage overhead near the leaves.
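    A minimal sketch of such a hybrid, in Python for readability (where cache lines are not really observable, so this shows only the shape of the structure). The threshold `LEAF_LIMIT` is a made-up stand-in for "fits in a cache line", and `build` and `complete` are names invented for this example.

```python
LEAF_LIMIT = 4  # hypothetical stand-in for "the alternatives fit in a cache line"


def build(words, depth=0):
    """Build a node: a small sorted list (bucket) or a dict of child nodes.

    `words` must not contain duplicates.
    """
    if len(words) <= LEAF_LIMIT:
        return sorted(words)
    groups = {}
    for word in words:
        # The key "" collects a word that ends exactly at this depth.
        groups.setdefault(word[depth:depth + 1], []).append(word)
    return {ch: build(group, depth + 1) if ch else group
            for ch, group in groups.items()}


def complete(node, prefix, depth=0):
    """Return every stored word that starts with `prefix`, sorted."""
    # Trie part: follow one child per character while we are in dict nodes.
    while isinstance(node, dict) and depth < len(prefix):
        node = node.get(prefix[depth])
        if node is None:
            return []
        depth += 1
    if isinstance(node, list):
        # Bucket part: linear scan over a handful of words.
        return [word for word in node if word.startswith(prefix)]
    # Prefix used up at an internal node: gather everything below it.
    found = []
    for child in node.values():
        found.extend(complete(child, prefix, depth))
    return sorted(found)


words = ["star", "start", "stack", "stand", "state", "static", "sun", "sea", "so"]
trie = build(sorted(set(words)))
print(complete(trie, "sta"))
```

    In a language with control over memory layout, the buckets would be contiguous arrays, which is where the caching benefit comes from.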

    Splay trees are most useful (if they are useful at all) on data structures which are modified frequently. The classic example is a "rope" data structure (a tree of string segments), which is one way to attempt to optimise a text editor by avoiding large string copies. Compared with a deterministic tree-balancing algorithm such as red-black trees, the splay tree algorithm has the benefit of simplicity, as well as only touching nodes which form part of the tree traversal.
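    To show both the simplicity and the mutate-on-read behaviour, here is a small sketch of a splay tree in Python. It uses a simplified recursive formulation of splaying rather than the bottom-up version from the original paper, and all names (`Node`, `splay`, `insert`, `inorder`) are chosen for this example. Only nodes on the search path are ever touched.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None


def rotate_right(x):
    y = x.left
    x.left, y.right = y.right, x
    return y


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y


def splay(root, key):
    """Move `key` (or the last node on its search path) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:      # zig-zig: two right rotations
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:    # zig-zag: left rotation, then right
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:         # mirror image of zig-zig
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:       # mirror image of zig-zag
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return root if root.right is None else rotate_left(root)


def insert(root, key):
    if root is None:
        return Node(key)
    root = splay(root, key)          # closest node is now the root
    if root.key == key:
        return root
    node = Node(key)
    if key < root.key:               # split the old tree around the new key
        node.right, node.left, root.left = root, root.left, None
    else:
        node.left, node.right, root.right = root, root.right, None
    return node


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


root = None
for k in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, k)
root = splay(root, 40)               # a plain lookup restructures the tree
print(root.key, inorder(root))
```

    The last two lines are the point: even a pure lookup returns a new root, so every reader is also a writer, which is exactly what hurts under concurrency.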

    However, the ready availability of self-balancing tree libraries (part of the standard libraries of many modern programming languages), combined with often-disappointing empirical results, makes the splay algorithm a niche product at best, although it is certainly a fascinating idea.