import pandas as pd

values = [ 8.42, 8.87, 8.88, 8.88, 8.88, 8.58, 8.58,
8.58, 8.58, 8.58, 8.58, 8.58, 8.58, 8.58, 0. , 8.58,
17.65, 17.65, 17.65, 17.65, 17.65, 17.65, 17.65, 17.65, 17.65,
17.65, 17.65, 17.65, 17.9 , 0. , 17.9 , 17.9 , 17.68, 17.68,
17.68, 17.68, 17.68, 17.68, 17.68, 17.68, 17.68, 17.68, 17.68,
8.89, 8.89, 9.86, 8. , 8.89, 8.89, 8.89, 8.93, 8.95,
]
data = pd.Series(values)
data.plot.kde()
I have a list of values, and I can easily generate a kernel density plot, which shows modal peaks at about 8 and 17.
I know that pandas' plot.kde() uses scipy.stats.gaussian_kde to generate the curve, and that with the curve I should be able to use scipy.signal.find_peaks to find the local maxima, but I can't quite get anything working.
How do I extract the modal peaks from a 1-d data vector?
This does the trick:
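import numpy as np
from scipy.signal import find_peaks
from scipy.stats import gaussian_kde

def get_modal_points(data, precision=0.1):
    # Drop missing values before fitting the KDE
    data = data.dropna()
    # Evaluation grid spanning the data range in steps of `precision`
    r = np.arange(data.min(), data.max(), precision)
    kernel = gaussian_kde(data)
    curve = kernel(r)
    # Indices of the local maxima of the density curve
    peaks, _ = find_peaks(curve)
    return r[peaks]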
The resulting modal_points are [0.1, 8.7, 17.7], which looks about right. The small peak near 0 presumably comes from the two 0. entries in the data.
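If a minor bump such as the one at 0.1 is not wanted, one option is to pass a height threshold to find_peaks. The following is only a sketch of that idea, not part of the original answer: the name get_major_modes, the min_rel_height parameter and its default of 0.2 are all made up for illustration. It also pads the grid by one kernel bandwidth on each side (my own choice), so that a peak sitting at the edge of the data has neighbours on both sides.

```python
import numpy as np
from scipy.signal import find_peaks
from scipy.stats import gaussian_kde


def get_major_modes(values, precision=0.1, min_rel_height=0.2):
    """Return KDE peak locations at least `min_rel_height` times the tallest peak.

    Hypothetical helper: the name and the threshold default are assumptions.
    """
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]  # drop missing values
    kernel = gaussian_kde(values)
    # Kernel bandwidth = scaling factor * sample standard deviation
    pad = kernel.factor * values.std(ddof=1)
    # Pad the grid so peaks at the data edges can still be detected
    grid = np.arange(values.min() - pad, values.max() + pad, precision)
    curve = kernel(grid)
    # Keep only local maxima that reach the relative height threshold
    peaks, _ = find_peaks(curve, height=min_rel_height * curve.max())
    return grid[peaks]
```

Calling get_major_modes(data) on the Series above should then return only the two large modes, provided the threshold sits between the small and large peak heights. find_peaks also accepts a prominence argument, which may be a better filter when the peaks overlap.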