Search code examples
indexingvoltdb

Why VoltDB choose Red-Black Tree as index structure?


As I can see from the source code:

enum TableIndexType {
    BALANCED_TREE_INDEX     = 1,
    HASH_TABLE_INDEX        = 2,
    BTREE_INDEX             = 3, // unused
    COVERING_CELL_INDEX     = 4
};

VoltDB mainly use BALANCED_TREE_INDEX as index structure, which use CompactingMap(an Red-Black Tree implementation) internally.

Comparing to b+tree, when do range query by index, using Red-Black Tree will lose spatial locality.


Solution

  • The primary reason VoltDB chose a Red-Black Tree implementation was to avoid memory fragmentation and to remain balanced. Performance needs to be consistent over the lifetime of the data, rather than using less flexible structures that would be faster in some cases, but would degrade performance or become less optimal over time.

    Records stored in VoltDB tables may contain various datatypes and numbers of columns, and they may persist in memory for various amounts of times. Indexes often contain multiple columns. With VARCHAR columns, values < 64 bytes are stored within the index, but for larger values the index uses a pointer to the one copy of the value in pooled memory.

    You can read a bit more on the reasoning here:

    Disclosure: I work at VoltDB.