
FLANN in MATLAB returns different distances from my own calculation


I'm using FLANN in MATLAB with SIFT feature descriptors as my data. There is a function:

[result, ndists] = flann_search(index, testset, ...);

Here the index is built with a kd-tree. According to the user manual, result contains the nearest neighbors of the samples in testset, and ndists contains the corresponding distances between the test samples and their nearest neighbors. I used the Euclidean distance and found that the distances in ndists differ from those I computed from the original data. Even stranger, all the numbers in ndists are integers, which is rarely the case for Euclidean distances. Can you help me explain this?


Solution

  • By default, FLANN returns the squared Euclidean distance, i.e. x_1^2 + ... + x_n^2, where x is the difference between the two vectors, without taking the square root. You can change the metric with flann_set_distance_type(type, order) (see the manual).

    An example, using the Python bindings (pyflann):

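    from pyflann import FLANN
    import numpy as np
    
    # Three 5-dimensional reference points
    dataset = np.array(
        [[1., 1, 1, 2, 3],
         [10, 10, 10, 3, 2],
         [100, 100, 2, 30, 1]
         ])
    # Two query points
    testset = np.array(
        [[1., 1, 1, 1, 1],
         [90, 90, 10, 10, 1]
         ])
    
    # Find the single nearest neighbor of each query point;
    # dists holds squared Euclidean distances by default
    result, dists = FLANN().nn(
        dataset, testset, 1, algorithm="kmeans", branching=32, iterations=7, checks=16)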
    

    Output:

    >>> result
    array([0, 2], dtype=int32)
    >>> dists
    array([  5., 664.])
    >>> ((testset[0] - dataset[0])**2).sum()
    5.0
    >>> ((testset[1] - dataset[2])**2).sum()
    664.0
    

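    SIFT descriptors have integer components, so every squared coordinate difference is an integer, and so is their sum, the squared Euclidean distance. To compare ndists with your own Euclidean distances, take the square root of ndists (or square your own values).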
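
To check the relationship by hand without FLANN, a minimal pure-Python sketch follows. It reuses the second query point and its nearest neighbor from the example above; the helper name squared_euclidean is hypothetical and not part of FLANN or pyflann.

```python
import math

def squared_euclidean(a, b):
    """Sum of squared coordinate differences, with no square root taken."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Integer-valued vectors taken from the example above
# (testset[1] and its nearest neighbor dataset[2]).
query = [90, 90, 10, 10, 1]
neighbor = [100, 100, 2, 30, 1]

sq_dist = squared_euclidean(query, neighbor)  # 664, an int for integer inputs
euclid = math.sqrt(sq_dist)                   # ordinary Euclidean distance

print(sq_dist, euclid)
```

The squared value matches the 664 reported in dists, while its square root is the non-integer figure a direct Euclidean computation would give.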