
knnsearch from Matlab to Julia


I am trying to run a nearest-neighbour search in Julia using the NearestNeighbors.jl package. The corresponding Matlab code is

X = rand(10, 1);    % 10 one-dimensional points, one per row
Y = rand(100, 1);   % 100 one-dimensional query points
Z = zeros(size(Y));
Z = knnsearch(X, Y);

This generates Z, a vector of length 100, where the i-th element is the index of the element of X that is nearest to the i-th element of Y, for all i=1:100.

Could really use some help converting the last line of the Matlab code above to Julia!


Solution

  • Use:

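    using NearestNeighbors

    X = rand(1, 10)      # 10 points, one per column
    Y = rand(1, 100)     # 100 query points, one per column
    nn(KDTree(X), Y)[1]  # indices of the nearest neighbors in X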
    

    Storing the intermediate KDTree object is useful if you want to reuse it later, because the tree is then built only once and every subsequent query is faster.
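
    As a minimal sketch of that reuse (the names tree, Y1, and Y2 are only illustrative), the tree is built once and then queried with two separate batches of points:

    ```julia
    using NearestNeighbors

    X = rand(1, 10)        # 10 one-dimensional points, one per column
    tree = KDTree(X)       # build the tree once

    Y1 = rand(1, 100)      # first batch of query points
    Y2 = rand(1, 50)       # second batch, answered by the same tree

    idx1, dist1 = nn(tree, Y1)   # indices into X and the matching distances
    idx2, dist2 = nn(tree, Y2)
    ```

    Each call returns a tuple of indices and distances, which is why the one-liner above takes element [1].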

    The crucial point of my example is the layout of the input. NearestNeighbors.jl accepts the input data in either of two forms:

    • a matrix of size nd × np with the points to insert in the tree where nd is the dimensionality of the points and np is the number of points
    • a vector of vectors with fixed dimensionality, nd, which must be part of the type.

    I have used the first approach. The point is that observations must be in columns (not in rows as in your original code). Remember that Julia vectors are treated as columns, so NearestNeighbors.jl considers rand(10) to be 1 observation with 10 dimensions, while rand(1, 10) is considered to be 10 observations with 1 dimension each.
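
    If your data already lives in plain vectors, the following sketch shows one way to bring it into either accepted form. The sample values are made up, and the second form assumes the StaticArrays.jl package is installed:

    ```julia
    using NearestNeighbors
    using StaticArrays   # assumption: needed only for the vector-of-vectors form

    x = [0.1, 0.5, 0.9]          # reference data stored as a plain vector
    y = [0.0, 0.45, 0.8, 1.0]    # query data stored as a plain vector

    # Form 1: nd × np matrices, here 1 × np, with one point per column
    X = reshape(x, 1, :)
    Y = reshape(y, 1, :)
    idxs_matrix, dists_matrix = nn(KDTree(X), Y)

    # Form 2: vectors of fixed-size vectors, the dimension is part of the type
    xs = [SVector(v) for v in x]   # Vector{SVector{1, Float64}}
    ys = [SVector(v) for v in y]
    idxs_svec, dists_svec = nn(KDTree(xs), ys)

    # Both forms describe the same points, so the indices agree
    idxs_matrix == idxs_svec
    ```

    For this data the nearest neighbors are elements 1, 2, 3 and 3 of x.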

    However, since your original data is small and one-dimensional and you only want the single nearest neighbor, it is enough to write (here I assume X and Y are the original data stored in vectors):

    [argmin(abs(v - y) for v in X) for y in Y]
    

    without using any extra packages.
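
    The same brute-force idea can be wrapped in a small function. This is only a sketch: the name nearest_index is hypothetical, the sample values are made up, and it uses broadcasting instead of a generator, which allocates a temporary array for each query point:

    ```julia
    # Hypothetical helper: for every y in Y, return the index of the closest element of X.
    # Ties are resolved in favor of the first (lowest) index, as argmin does.
    nearest_index(X, Y) = [argmin(abs.(X .- y)) for y in Y]

    X = [0.1, 0.5, 0.9]
    Y = [0.0, 0.45, 0.8, 1.0]

    Z = nearest_index(X, Y)   # one index into X per element of Y
    ```

    Every query scans all of X, so the cost grows with length(X) * length(Y), which is fine for small inputs like the ones in the question.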

    NearestNeighbors.jl is very efficient when working with high-dimensional data that has very many elements.