java mutable b-tree multimap persistent-storage

Indexing <String, ArrayList<Integer>> using a B-Tree


I am about to index 10 million titles by their IDs (for now, their line numbers). The titles will be tokenised before they are stored. The data has to be structured as something like <String, ArrayList<Integer>>, where each String is a token and the Integers are line numbers.

I have to build this tool in Java with persistent storage, avoiding an RDBMS if possible. Because this data structure is mutable, I couldn't find any tool that supports a multimap with the structure <String, ArrayList<Integer>> indexed by a B-tree or any other persistent data structure.

I tried MapDB, but it turned out to accept only immutable values, which does not work in my case (ArrayList).

Any thoughts are appreciated.


Solution

  • What you need is called a multimap. MapDB does not support multimaps directly, but it has composite sets, which are almost as good.

    An example is here: https://github.com/jankotek/MapDB/blob/release-1.0/src/test/java/examples/MultiMap.java
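
To illustrate the composite-set idea without depending on any particular MapDB version, here is a minimal sketch that uses only the JDK. Instead of storing one mutable ArrayList per token, it stores each (token, line number) pair as its own immutable element of a sorted set, so all line numbers for a token sit next to each other and can be read with a range query. The class and method names (CompositeIndexSketch, Posting, add, lines) are hypothetical, and java.util.TreeSet is only an in-memory stand-in; with MapDB the assumption is that the same pairs would live in one of its persistent sorted sets, as in the linked example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical illustration class; this is not MapDB API.
class CompositeIndexSketch {

    // One immutable (token, line) pair, ordered by token first, then by line.
    static final class Posting implements Comparable<Posting> {
        final String token;
        final int line;

        Posting(String token, int line) {
            this.token = token;
            this.line = line;
        }

        @Override
        public int compareTo(Posting other) {
            int byToken = token.compareTo(other.token);
            return byToken != 0 ? byToken : Integer.compare(line, other.line);
        }
    }

    // In-memory stand-in for a persistent sorted set (assumption).
    private final NavigableSet<Posting> postings = new TreeSet<>();

    // Adding a line for a token inserts one small element; no list is mutated.
    void add(String token, int line) {
        postings.add(new Posting(token, line));
    }

    // Removing a single (token, line) pair is likewise one set operation.
    boolean remove(String token, int line) {
        return postings.remove(new Posting(token, line));
    }

    // All line numbers for a token: a range scan over (token, MIN)..(token, MAX).
    List<Integer> lines(String token) {
        List<Integer> result = new ArrayList<>();
        Posting from = new Posting(token, Integer.MIN_VALUE);
        Posting to = new Posting(token, Integer.MAX_VALUE);
        for (Posting p : postings.subSet(from, true, to, true)) {
            result.add(p.line);
        }
        return result;
    }

    public static void main(String[] args) {
        CompositeIndexSketch index = new CompositeIndexSketch();
        index.add("tree", 7);
        index.add("java", 3);
        index.add("tree", 2);
        index.add("java", 3); // duplicate pair, ignored by the set
        System.out.println(index.lines("tree")); // prints [2, 7]
        System.out.println(index.lines("java")); // prints [3]
        System.out.println(index.lines("none")); // prints []
    }
}
```

Because every element is an immutable pair, this layout avoids the problem of storing a mutable ArrayList as a value: updating the index only ever adds or removes whole elements.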