I'm generating 4 million documents and saving them with the bulk processor. When I set index.store.type to memory I get some weird NPEs; the run completes, but in the end only about 2 million documents are indexed. I'm inserting 800 docs per batch (very small ones, a few KB each) with 3 threads and a 1 GB heap. Using the same code with index.store.type set to simplefs, inserting 3k docs per batch with 4 threads, everything goes smoothly (of course, those larger settings don't work for 'memory' either) and the end result is 4 million indexed docs, as expected. Are there any additional settings I should set to make it work with the 'memory' store type? I have 1 node, 5 shards, 1 replica.
If you are storing in memory with only one node, do you need the replica? That could end up with 2 copies on the same server.
I'd suggest you add extra nodes to scale out the load and make use of the 5 shards you have; on a single node the extra shards are pointless. That said, you can't really add shards later, so keeping them makes sense if you plan to add nodes.
5KB * 4 million is 20GB. You don't say how much RAM you have so it's hard to say whether all your docs would even fit into memory after all the extra indexing data is added on top.
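To make that back-of-the-envelope figure easy to redo with your real numbers, here is a minimal sketch. The function name and the `copies` parameter are made up for illustration, the 5 KB average is just the assumption used above, and the result covers raw document bytes only, not the extra indexing data.

```python
def estimate_raw_size_gb(doc_count, avg_doc_kb, copies=1):
    """Rough size of the raw documents in decimal GB.

    copies is a hypothetical knob: 1 for primaries only,
    2 if a replica copy is also held in memory.
    Index overhead is NOT included.
    """
    bytes_total = doc_count * avg_doc_kb * 1000 * copies
    return bytes_total / 1e9


# Numbers from this thread: 4 million docs, assumed ~5 KB each.
primary_only = estimate_raw_size_gb(4_000_000, 5)
with_replica = estimate_raw_size_gb(4_000_000, 5, copies=2)

print(f"primaries only: {primary_only:.0f} GB")
print(f"with 1 replica: {with_replica:.0f} GB")
```

Compare whatever this gives against the RAM actually available on the box; if the real average document is closer to 1 KB, the picture changes a lot.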