Hi I have a very long numpy vector (1000000+) maxy1
filled with integers numbers. The vector is sorted ascending. now I also have another integer val
. Nnow I want to find the highest number in maxy1
below val
what I do is:
idx = maxy1[np.where(maxy1 < val)[0][-1]]
but is too slow, I have to repeat this operation for 10000 times and it's taking 94% of the running time. any idea to make it faster?
For this sort of task I suggest you use np.searchsorted:
import numpy as np
# setup
maxy1 = np.array([4, 5, 6, 7, 8, 9])
val = 7
# does a binary search
res = np.searchsorted(maxy1, val)
print(maxy1[res - 1])
Output
6
From the documentation:
Binary search is used to find the required insertion points.
A binary search has a worst case performance of O(logN) (see Wikipedia entry) as opposed to your current approach that is linear O(N).