Search code examples
algorithmlanguage-agnosticdata-miningnearest-neighbor

What are some fast approximations of Nearest Neighbor?


Say I have a huge (a few million) list of n vectors, given a new vector, I need to find a pretty close one from the set but it doesn't need to be the closest. (Nearest Neighbor finds the closest and runs in n time)

What algorithms are there that can approximate nearest neighbor very quickly at the cost of accuracy?

EDIT: Since it will probably help, I should mention the data are pretty smooth most of the time, with a small chance of spikiness in a random dimension.


Solution

  • There are exist faster algoritms then O(n) to search closest element by arbitary distance. Check http://en.wikipedia.org/wiki/Kd-tree for details.