
Which access method should be used for a Berkeley DB database that will store 15,000,000 integer keys?


I am planning to evaluate BerkeleyDB for a project where I have to store 15,000,000 key/value pairs.

Keys are 10-digit integers. Values are variable-length binary data.

The BerkeleyDB documentation (https://web.stanford.edu/class/cs276a/projects/docs/berkeleydb/ref/am_conf/intro.html) lists four access methods that can be configured:

  1. Btree
  2. Hash
  3. Queue
  4. Recno

While the documentation describes each access method, I cannot work out which one would best fit the specific data set I need to store.

Which access method should be used for this kind of data?


Solution

  • When unsure, choose Btree. It's the most flexible access method. If you're positive that your application fits one of the other access methods, go for it.

    A note of caution: writing an application with BDB that really works, one that is transactional, recoverable, and offers consistency guarantees, is going to be time-consuming and error-prone at every step. And if you're using this for commercial purposes, the licensing could be a total dealbreaker. For some things, it's really the best option. Just make sure you weigh all the other key-value store options before embarking on your BDB quest: https://en.wikipedia.org/wiki/Key-value_database
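If you do go with Btree, one practical detail for 10-digit integer keys is how they are turned into bytes. A 10-digit value can exceed the unsigned 32-bit range (4,294,967,295), so it needs a wider encoding, and Btree's default comparison is a byte-by-byte lexicographic one, so a fixed-width big-endian encoding keeps the keys in numeric order. The sketch below only illustrates that encoding using Python's standard library; it does not call Berkeley DB at all, and the helper names `encode_key` and `decode_key` are hypothetical, not part of any BDB API.

```python
import struct

MAX_KEY = 9_999_999_999  # largest 10-digit integer; does not fit in 32 bits


def encode_key(n: int) -> bytes:
    """Pack a 10-digit (or shorter) non-negative integer as 8 big-endian bytes."""
    if not 0 <= n <= MAX_KEY:
        raise ValueError("expected a non-negative integer of at most 10 digits")
    return struct.pack(">Q", n)  # ">Q" = big-endian unsigned 64-bit


def decode_key(raw: bytes) -> int:
    """Inverse of encode_key."""
    return struct.unpack(">Q", raw)[0]


keys = [9_999_999_999, 42, 4_294_967_296, 1_000_000_000]

# Sorting the encoded bytes lexicographically (as a default byte-wise
# comparison would) yields the same order as sorting the integers.
encoded = sorted(encode_key(k) for k in keys)
decoded = [decode_key(b) for b in encoded]

# Little-endian packing, by contrast, does not preserve numeric order:
# the bytes for 256 sort before the bytes for 1.
little_endian_order = sorted([1, 256], key=lambda k: struct.pack("<Q", k))
```

The resulting 8-byte strings would then be used as the record keys. If keys are stored in native little-endian form instead, a custom comparison function is needed to get numeric ordering back.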