I was reading about skip lists and MemSQL and was wondering why skip lists are not more widely used in databases? Are there any major disadvatages to using skiplists?
Databases are typically so huge that they have to be stored in external memory, such as a giant disk drive. As a result, the bottleneck in most database applications is the number of times that we have to do a memory transfer from the disk drive into main memory.
B-trees and their variants are specifically designed to minimize the number of block reads and writes necessary to perform each of their operations. Mathematically, the number of memory transfers required for each B-tree operation is O(log n / log B), where B is the block size. Compare this to a skiplist, which requires O(log n) memory transfers on expectation. Since B is usually measured in megabytes, log B can be in the neighborhood of 15 - 25, so the B-tree can be significantly faster. Even when the database is in main memory, the effect of the memory hierarchy (L1 and L2 caches, etc.) can be so pronounced that B-tree variants are still faster in practice than many other data structures. This Google blog post gives some background on this.
Although each operation on a B-tree typically requires more CPU work than corresponding operations in other data structures, the fact that they require so few memory transfers tends to make them significantly faster in practice than other data structures. Therefore, it would not be advisable to use a skip list in a database.
There's another reason B-trees are nice: they're worst-case efficient. Although deterministic skip lists do exist, most skiplist implementations are randomized and give expected guarantees on their behavior. In a database, this might be unacceptable because many use cases on databases require worst-case efficient behavior.
Hope this helps!