
Finding nearest value in numpy array given another array of values


Suppose you have two arrays of different lengths, such that:

a = np.linspace(1, 10, 25)
b = np.linspace(2, 10, 12)

For each value in b, I want the index of the element of a that is closest to it (smallest absolute difference).

A simple approach that works is the following:

def find_nearest(a, b):
    indexes = []
    for i in range(len(b)):
        idx = (np.abs(a - b[i])).argmin()
        indexes.append(idx)
    return indexes

However, this loops over b in Python and scans all of a on every iteration, so it is not very efficient when a and b are long. Is there a better way to write this? I'm mainly interested in the case where a is longer than b.


Solution

  • Since your input values are sorted, the most efficient approach is to use np.searchsorted:

    a = np.linspace(1, 10, 25)
    b = np.linspace(2, 10, 12)
    
    # find the insertion index i such that a[i-1] < v <= a[i]
    idx = np.searchsorted(a, b, side='left')
    
    # check if the previous value is closer
    out = np.where(np.abs(b-a[idx]) < np.abs(b-a[idx-1]), idx, idx-1)
    

    Output:

    array([ 3,  5,  7,  8, 10, 12, 14, 16, 18, 20, 22, 24])
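
The snippet above indexes a[idx] and a[idx-1] directly, so a value in b larger than every element of a would raise an IndexError, and a value at or below a[0] only works because a[-1] wraps around to the last element. Here is a minimal sketch of an edge-safe variant, assuming a is sorted in ascending order and has at least two elements. The function name find_nearest_sorted is made up for illustration and is not part of the original answer:

```python
import numpy as np


def find_nearest_sorted(a, b):
    """For each value in b, return the index of the nearest value in sorted a."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Insertion points such that a[idx-1] < v <= a[idx]
    idx = np.searchsorted(a, b, side='left')
    # Assumption: len(a) >= 2. Clip so that idx and idx-1 are both valid indexes.
    idx = np.clip(idx, 1, len(a) - 1)
    # Keep idx if it is strictly closer, otherwise fall back to idx-1
    # (so ties go to the lower index, like argmin in the loop version).
    return np.where(np.abs(b - a[idx]) < np.abs(b - a[idx - 1]), idx, idx - 1)


a = np.linspace(1, 10, 25)
b = np.array([0.0, 2.0, 5.6, 12.0])  # 0.0 and 12.0 lie outside the range of a
print(find_nearest_sorted(a, b))
```

Values outside the range of a are mapped to the first or last index, which is what the original loop with argmin would return as well.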
    

    Timings

    Compared to broadcasting (a has 2*N items, b has N items).

    [Figure: timing plot, image not included]
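
The broadcasting baseline used in the comparison is not shown in the answer. A plausible sketch of it, as an assumption about what was timed, builds the full matrix of pairwise distances and takes the argmin along the axis of a:

```python
import numpy as np

a = np.linspace(1, 10, 25)
b = np.linspace(2, 10, 12)

# a[:, None] has shape (len(a), 1) and b[None, :] has shape (1, len(b)),
# so the difference broadcasts to a (len(a), len(b)) matrix.
diff = np.abs(a[:, None] - b[None, :])

# For each column (one value of b), the row index of the smallest distance.
idx_broadcast = diff.argmin(axis=0)
print(idx_broadcast)
```

This does not require a to be sorted, but it allocates len(a) * len(b) values, whereas searchsorted does a binary search in a for each value of b.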