Suppose you have two arrays of different lengths, for example:
a = np.linspace(1, 10, 25)
b = np.linspace(2, 10, 12)
For each value in the array b, I want to find the index of the value in the array a that is closest to it (smallest absolute difference).
A simple way to do this, which works, is the following:
def find_nearest(a, b):
    indexes = []
    for i in range(len(b)):
        idx = (np.abs(a - b[i])).argmin()
        indexes.append(idx)
    return indexes
However, if a and b are quite long arrays, this is not very efficient. Is there a better way to write this? I'm interested in the case where the array a is longer than the array b.
Since your input values are sorted, the most efficient approach would be to use searchsorted:
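import numpy as np

a = np.linspace(1, 10, 25)
b = np.linspace(2, 10, 12)
# for each value v in b, find the insertion index i such that a[i-1] < v <= a[i]
idx = np.searchsorted(a, b, side='left')
# keep i if a[i] is strictly closer to v than a[i-1], otherwise use i-1
# (this assumes every v satisfies a[0] < v <= a[-1], so both i-1 and i are valid)
out = np.where(np.abs(b-a[idx]) < np.abs(b-a[idx-1]), idx, idx-1)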
Output:
array([ 3, 5, 7, 8, 10, 12, 14, 16, 18, 20, 22, 24])
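The snippet above relies on every value of b lying inside the range of a. If that is not guaranteed, searchsorted can return 0 or len(a), and idx-1 or idx then points at the wrong element or out of bounds. A minimal sketch of one way to guard against that, assuming a is sorted and has at least two elements (the function name nearest_indices is just an illustrative choice):

```python
import numpy as np

def nearest_indices(a, b):
    """For each value in b, return the index of the closest value in sorted a."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Insertion points, clipped so that both idx - 1 and idx are valid indices.
    idx = np.clip(np.searchsorted(a, b, side="left"), 1, len(a) - 1)
    left = a[idx - 1]
    right = a[idx]
    # Pick the right neighbour only when it is strictly closer; ties go left.
    return np.where(np.abs(b - right) < np.abs(b - left), idx, idx - 1)
```

With this version, values of b below a[0] map to index 0 and values above a[-1] map to len(a) - 1, while in-range values give the same result as before.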
Compared to broadcasting (a has 2*N items, b has N items).
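The broadcasting code used for that comparison is not shown here. A plausible sketch of what such a version looks like, offered as an assumption rather than the exact code that was timed (the names diffs and out_broadcast are illustrative):

```python
import numpy as np

a = np.linspace(1, 10, 25)
b = np.linspace(2, 10, 12)

# diffs[i, j] = |a[i] - b[j]|, an array of shape (len(a), len(b))
diffs = np.abs(a[:, None] - b[None, :])
# for each column j (each value of b), take the row index of the smallest difference
out_broadcast = diffs.argmin(axis=0)
```

This does not need a to be sorted, but it builds a full len(a) x len(b) intermediate array, so its memory use and run time grow with the product of the two lengths, whereas searchsorted performs a binary search per value of b.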