algorithm hadoop hash surf knn

Any good nearest-neighbors algorithm for similar images?


I am looking for an algorithm that can search for similar images in a large collection. I'm currently using a SURF implementation in OpenCL.

At first I used a KNN search to compare every image's interest points against the rest of the collection, but tests revealed that it doesn't scale well. I've also tried a Hadoop implementation of KNN-Join, which takes a lot of temporary space in HDFS, far too much compared to the amount of input data. In fact, a pairwise-distance approach isn't really appropriate given the dimensionality of my input vectors (64).

I have heard of Locality-Sensitive Hashing and wondered whether there is a free implementation, or whether it is worth implementing myself. Maybe there is another algorithm I am not aware of?
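For reference, the idea behind locality-sensitive hashing can be sketched in a few lines. This is only a minimal illustration using the random-hyperplane variant, which hashes by angle and so assumes the descriptors are normalized to unit length. The function names (`make_hyperplanes`, `signature`, `build_index`, `query`) and the parameters are made up for this example and do not come from any library.

```python
import random

def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes)

def build_index(descriptors, planes):
    """Group descriptor indices into buckets keyed by their bit signature."""
    buckets = {}
    for idx, vec in enumerate(descriptors):
        buckets.setdefault(signature(vec, planes), []).append(idx)
    return buckets

def query(vec, descriptors, planes, buckets):
    """Return the index of the closest descriptor in the query's bucket, or None.

    Only one bucket is scanned, so the true nearest neighbor can be missed
    if it hashed to a different signature.
    """
    candidates = buckets.get(signature(vec, planes), [])

    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(descriptors[i], vec))

    return min(candidates, key=sq_dist) if candidates else None
```

A query then compares against one bucket instead of the whole collection. In practice several independent hash tables are usually combined to reduce the chance of missing a true neighbor; that part is left out here.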


Solution

  • IIRC, FLANN (Fast Library for Approximate Nearest Neighbors) is a good compromise between search speed and accuracy: http://people.cs.ubc.ca/~mariusm/index.php/FLANN/FLANN
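To give a feel for the kind of trade-off FLANN makes, here is a rough sketch of approximate search in a single k-d tree, where the search stops after visiting a limited number of leaves. This is not FLANN's code: the library itself uses several randomized k-d trees and hierarchical k-means trees and picks parameters automatically. The names `build_kdtree`, `approx_nearest`, `leaf_size` and `max_leaves` are hypothetical and chosen just for this illustration.

```python
import heapq

def build_kdtree(points, indices=None, depth=0, leaf_size=8):
    """Recursively split point indices at the median, cycling through the axes."""
    if indices is None:
        indices = list(range(len(points)))
    if len(indices) <= leaf_size:
        return ("leaf", indices)
    axis = depth % len(points[0])
    indices.sort(key=lambda i: points[i][axis])
    mid = len(indices) // 2
    split = points[indices[mid]][axis]
    left = build_kdtree(points, indices[:mid], depth + 1, leaf_size)
    right = build_kdtree(points, indices[mid:], depth + 1, leaf_size)
    return ("node", axis, split, left, right)

def approx_nearest(tree, points, q, max_leaves=4):
    """Best-bin-first search that gives up after max_leaves leaves.

    Returns (index, squared distance). A small max_leaves is faster but may
    miss the true nearest neighbor; a very large one makes the search exact.
    """
    best_i, best_d = None, float("inf")
    # Heap entries: (lower bound on squared distance, tiebreaker, subtree).
    heap = [(0.0, 0, tree)]
    counter = 1
    visited = 0
    while heap and visited < max_leaves:
        bound, _, node = heapq.heappop(heap)
        if bound >= best_d:
            break  # no remaining subtree can contain a closer point
        while node[0] == "node":
            _, axis, split, left, right = node
            diff = q[axis] - split
            near, far = (left, right) if diff < 0 else (right, left)
            # Remember the far side for later, keyed by how far away it must be.
            heapq.heappush(heap, (max(bound, diff * diff), counter, far))
            counter += 1
            node = near
        visited += 1
        for i in node[1]:
            d = sum((a - b) ** 2 for a, b in zip(points[i], q))
            if d < best_d:
                best_i, best_d = i, d
    return best_i, best_d
```

The `max_leaves` limit is where speed is traded for accuracy. Note that plain k-d trees lose much of their advantage on 64-dimensional descriptors, which is one reason a library like FLANN combines several trees instead of relying on one.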