pythonnumpy

How to find the threshold that will yield the desired number of array elements

Given an array of numbers and a target count, I want to find the threshold such that the number of element that are above it will be equal the target (or as close as possible).

For example.

``````arr = np.random.rand(100)
target = 80
for i in range(100):
t = i * 0.01
if (arr > t).sum() < target: break
print(t)
``````

However this is not efficient and it is not very precise, and perhaps someone has already solved this problem.

EDIT:

In the end I found `scipy.optimize.bisect` (link) which works perfectly.

Solution

• I am using `scipy.optimize.bisect`

``````import scipy.optimize
import numpy as np

def compute_cutoff(arr:np.ndarray, volume:float) -> float:
"""
Compute the cutoff to attain desired volume.
Returns the cutoff, such that (arr > cutoff).sum() == volume
or as close as possible.
:param np.ndarray arr: an array with values (0 .. 1)
:param float volume: desired volume in number of voxels
"""
tolerance = max(1,volume*.01)
gross_diff = lambda y: y - volume if abs(y - volume) > tolerance else 0
err_fn = lambda x : gross_diff((arr > x).sum())
# probably this [-2,2] could be tighter [0,1], but just to be safe.
return scipy.optimize.bisect(err_fn, -2, 2, maxiter=50, disp=False)

arr = np.random.rand(100)
t = compute_cutoff(arr, 80)
print(t)
``````

This prints a value close to 0.2