Search code examples
elasticsearchvectorindexingknn

Elastic KNN search num candidates. How are the candidates selected?


I'm trying to understand the "num_candidates" parameter when using KNN search in elastic search.

What I undertand reading some posts/documentation I will list at the end is that ES select "num_candidates" documents from each shard and then select "k" documents from the "num_candidates" selected documents.

My question is, how those "num_candidates" documents are selected? I found this answer but I do not understand the "how".

"num_candidates is the same idea as efSearch. Its the number of candidates we continue to keep track of while searching the HNSW graph per shard."

My concern behind this is if KNN does not evaluate all the documents in an index.

And my current issue is that executing a knn search I'm getting different results from one time to other with no changes in the index between searches.

Thank you.

Related documentation:

https://www.elastic.co/search-labs/blog/elasticsearch-knn-and-num-candidates-strategies https://www.elastic.co/search-labs/blog/simplifying-knn-search https://www.elastic.co/guide/en/elasticsearch/reference/master/knn-search.html#tune-approximate-knn-for-speed-accuracy https://discuss.elastic.co/t/elastic-knn-search-questions/344684


Solution

  • Tldr;

    Yes, you do not evaluate all document in a shard. Yes, there is a bit of randomness involved.

    Searching over all document would too expensive.

    To understand

    The HNSW (Hierarchical Navigable Small World) algorithm, will work by linking vector that are similar together. But not to all vector.

    Then at search time, the algorithm select a node at random, and compute for its neighbours only who is the closest. In the case of a neighbours being closer, it is repeating the process over and over, until you get to the max num of candidates.

    The all shards return their approximative selection, and compute a top k.

    So yes you will have randomness and 2 queries with the same parameters might not have the same results.

    The algorithm I explained is a gross simplification of the actual algorithm