python-3.x computational-geometry nearest-neighbor locality-sensitive-hash approximate-nn-searching

LSH implementation in Python 3 with Euclidean distance and seeing all neighbors in LSHForest


I am looking for an efficient implementation of LSH in Python 3 that uses Euclidean distance.

There is the "in-Python" LSHForest implementation, but it uses cosine distance.

Also, even with this implementation, I couldn't find a way to see the contents of each bucket, e.g., when using LSH for clustering: it only returns a certain number of approximate neighbors within a certain radius. If I want to see all neighbors, I don't see how it can be done. I do not want to use an arbitrary search radius, and I am not sure what a very large or infinite radius means in this implementation.

I would appreciate any insight. Many thanks.


Solution

  • For software recommendations, please ask here: Software Recommendations.


    To see how this works, first read my answer, then suppose you ask the package (I haven't used it) for a large k, where k is the number of neighbors the software returns, within a large radius r. That should return many neighbors. Set k = N, where N is the number of points in your dataset, and you will get all the neighbors.

    If you want to see all the neighbors within a certain bucket, then you have to find out how many points a bucket can contain and set k to that number.
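
If the package does not expose its buckets, one option is to build a small table yourself. The following is only a minimal sketch of Euclidean LSH with random Gaussian projections, h(x) = floor((a·x + b) / w), using a single hash table; it is not the LSHForest API and not necessarily how any particular package works. The class name `EuclideanLSH` and the parameters `num_hashes` and `bucket_width` are made up for illustration, and good values depend on the scale of your data.

```python
import math
import random
from collections import defaultdict


class EuclideanLSH:
    """Hypothetical single-table LSH for Euclidean distance (illustrative only)."""

    def __init__(self, dim, num_hashes=4, bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        # One Gaussian direction a and one offset b in [0, w] per hash function.
        self.a = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(num_hashes)]
        self.b = [rng.uniform(0.0, bucket_width) for _ in range(num_hashes)]
        self.buckets = defaultdict(list)  # bucket key -> list of point indices
        self.points = []

    def _key(self, x):
        # h(x) = floor((a . x + b) / w), concatenated over all hash functions.
        return tuple(
            math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / self.w)
            for a, b in zip(self.a, self.b)
        )

    def add(self, x):
        """Index a point and return its integer id."""
        idx = len(self.points)
        self.points.append(tuple(x))
        self.buckets[self._key(x)].append(idx)
        return idx

    def bucket_of(self, x):
        """Ids of all indexed points that share x's bucket (no k, no radius)."""
        return list(self.buckets.get(self._key(x), []))

    def query(self, x, k=None):
        """Bucket candidates sorted by true Euclidean distance; k=None keeps all."""
        cands = sorted(self.bucket_of(x),
                       key=lambda i: math.dist(x, self.points[i]))
        return cands if k is None else cands[:k]


if __name__ == "__main__":
    rng = random.Random(1)
    lsh = EuclideanLSH(dim=2)
    for _ in range(200):
        lsh.add((rng.uniform(0, 20), rng.uniform(0, 20)))
    # Every bucket and its members are directly visible.
    for key, members in lsh.buckets.items():
        print(key, members)
```

With a single table, two nearby points can still land in different buckets, so real implementations usually keep several tables and merge the candidates. A larger `bucket_width` gives fewer, bigger buckets; a larger `num_hashes` gives more, smaller ones.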