Search code examples
algorithmsearchtime-complexitynearest-neighborquantifiers

approximate nearest neighbors time complexity


I'm reading this paper Product quantization for nearest neighbor search.

On the last row of table II page 5 it gives

the complexity given in this table for searching the k smallest elements is the average complexity for n >> k and when the elements are arbitrarily ordered enter image description here

which is n+klogkloglogn.

I guess we can use linear selection algorithm to get unsorted k nearest neighbors with O(n), and sort the k nearest neighbors with O(klogk), so can we have O(n+klogk) in total. But where does the loglogn term come from?

The paper gives a reference to the TAOCP book for this, but I don't have the book at hand, could anyone explain it for me?


Solution

  • First, Table II reports the complexity of each of the step, therefore you have to add all the terms to measure the complexity of ADC.

    In the last line of the table, it is a single complexity both for SDC and ADC, which is:

    n + k log k log log n

    The term corresponds to the average algorithmic cost of the selection algorithm that we employ to find the k smallest values in a set of n variables, that we have copy/pasted from Donald Knuth book [25].

    I don't have the book in hands to I can not check, but it sounds right.


    From the authors