Search code examples
data-structuresvan-emde-boas-trees

What prevents Van Emde Boas trees from being more popular in real-world applications?


We know balanced trees perform insertion, deletion, and search in O(log n)-time, examples include

  • Red-Black
  • AVL
  • Splay
  • B-tree (and its variants).

However, when keys are integers in some limited range, it is possible to use a Van Emde Boas tree to drop these operations down to O(log(log n))-time, i.e., exponentially better than AVL or RB trees. Well, this is actually the case of many real world applications.

I see lots of applications for this. One I'd like to cite is on databases, for which creating indexes basically involves choosing between a Hash or a B*-tree. If a Van Emde Boas tree was implemented, it would provide a halfway between these two options, theoretically improving many query optimization problems.

Why Van Emde Boas tree is not widely used as say Red-Black or B-tree since

  • it's not a novelty (it was invented in 1975)
  • easy to implement
  • way faster than other trees

and what are the considerations about it?


Solution

  • The asymptotic complexity is sometimes misleading. In the case for Van Emde Boas tree the constant is quite large see here. I quote:

    However, for small trees the overhead associated with vEB trees
    is enormous: on the order of 2^(m/2)
    

    There are also other cases where an algorithm with better complexity exists but it only gets better for an input so big that in practice it is almost never used e.g. linear time static RMQ.