I'm trying to index over 200 million documents using HNSW in Vespa, but memory consumption grows significantly as the number of documents increases. My server has 64GB of memory, and I estimate that storing all the data in Vespa would require 750GB. Is there a way for Vespa to manage this 750GB dataset efficiently without adding more memory or servers?
Ideally, I'd like solutions that maintain search quality, avoiding reductions in vector dimensionality or HNSW parameters.
I've searched the official documentation but haven't found a suitable answer. When the memory limit is reached, either feeding is blocked or everything becomes very slow because of swapping to disk. Does anyone have ideas on how to handle this effectively?
The vectors must reside in memory to get reasonable performance, as they need to be randomly accessed both during write and queries. You just need to pay what it costs.
Technically, you can declare the vector attribute as paged so that it can be swapped out of RAM to disk, but then both writing and querying will become very slow.
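To make the "pay what it costs" point concrete, here is a rough back-of-the-envelope sketch of the memory needed to keep vectors and HNSW links in RAM. The function name, the 768-dimension float vectors, and the 16 links per node are illustrative assumptions (the question does not state them), and the estimate ignores upper graph levels and allocator overhead, so treat the result as a lower bound rather than what Vespa will actually report.

```python
def estimate_hnsw_memory_gb(num_docs, dims, bytes_per_value=4, max_links=16):
    """Rough lower bound on RAM for vectors plus level-0 HNSW links.

    Hypothetical helper, not part of any Vespa API.
    """
    # Raw vector storage per document (float32 by default).
    vector_bytes = dims * bytes_per_value
    # Assumption: level 0 stores up to 2 * max_links neighbors,
    # each as a 4-byte document id. Upper levels are ignored.
    link_bytes = 2 * max_links * 4
    return num_docs * (vector_bytes + link_bytes) / 1e9


# Assumed numbers: 200 million documents, 768-dim float32 vectors.
needed_gb = estimate_hnsw_memory_gb(200_000_000, 768)
print(f"estimated: {needed_gb:.0f} GB, vs. 64 GB available")
```

Under these assumed numbers the estimate is roughly ten times the 64GB on the server, which is why the data cannot stay resident without more memory or more nodes.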