Search code examples
javadictionaryberkeley-dbpersistentberkeley-db-je

Recommend a fast & scalable persistent Map - Java


I need a disk backed Map structure to use in a Java app. It must have the following criteria:

  1. Capable of storing millions of records (even billions)
  2. Fast lookup - the majority of operations on the Map will simply to see if a key already exists. This, and 1 above are the most important criteria. There should be an effective in memory caching mechanism for frequently used keys.
  3. Persistent, but does not need to be transactional, can live with some failure. i.e. happy to synch with disk periodically, and does not need to be transactional.
  4. Capable of storing simple primitive types - but I don't need to store serialised objects.
  5. It does not need to be distributed, i.e. will run all on one machine.
  6. Simple to set up & free to use.
  7. No relational queries required

Records keys will be strings or longs. As described above reads will be much more frequent than writes, and the majority of reads will simply be to check if a key exists (i.e. will not need to read the keys associated data). Each record will be updated once only and records are not deleted.

I currently use Bdb JE but am seeking other options.


Update

Have since improved query performance on my existing BDB setup by reducing the dependency on secondary keys. Some queries required a join on two secondary keys and by combining them into a composite key I removed a level of indirection in the lookup which speeds things up nicely.


Solution

  • I'd likely use a local database. Like say Bdb JE or HSQLDB. May I ask what is wrong with this approach? You must have some reason to be looking for alternatives.

    In response to comments: As the problem performance and I guess you are already using JDBC to handle this it might be worth trying HSQLB and reading the chapter on Memory and Disk Use.