python kernel-density

How do I extract the modal peaks from a 1-d data vector?


import pandas as pd

values = [ 8.42,  8.87,  8.88,  8.88,  8.88,  8.58,  8.58,
        8.58,  8.58,  8.58,  8.58,  8.58,  8.58,  8.58,  0.  ,  8.58,
       17.65, 17.65, 17.65, 17.65, 17.65, 17.65, 17.65, 17.65, 17.65,
       17.65, 17.65, 17.65, 17.9 ,  0.  , 17.9 , 17.9 , 17.68, 17.68,
       17.68, 17.68, 17.68, 17.68, 17.68, 17.68, 17.68, 17.68, 17.68,
        8.89,  8.89,  9.86,  8.  ,  8.89,  8.89,  8.89,  8.93,  8.95,
]
data = pd.Series(values)
data.plot.kde()

[Figure: KDE plot showing peaks]

I have a list of values, and I can easily generate a kernel density plot which shows there are modal peaks at about 8 and 17.

I know that pandas uses scipy.stats.gaussian_kde to generate the curve, and that I should be able to run scipy.signal.find_peaks on that curve to locate its local maxima, but I can't quite get anything working.

How do I extract the modal peaks from a 1-d data vector?


Solution

  • This does the trick:

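    import numpy as np
    import pandas as pd
    from scipy.signal import find_peaks
    from scipy.stats import gaussian_kde

    def get_modal_points(data, precision=0.1):
        # Drop missing values before fitting the density estimate
        data = data.loc[~pd.isna(data)].copy()
        # Evaluation grid spanning the data range in steps of `precision`
        r = np.arange(data.min(), data.max(), precision)
        kernel = gaussian_kde(data)
        curve = kernel(r)  # estimated density at each grid point
        # Indices of the local maxima of the density curve
        peaks, _ = find_peaks(curve)
        return r[peaks]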
    

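    For the data above, modal_points comes out as [ 0.1, 8.7, 17.7], which looks about right: 8.7 and 17.7 match the peaks at about 8 and 17 seen in the plot, and the small peak near 0 presumably comes from the two 0. entries.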
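If minor bumps like the one near 0 are unwanted, one possible refinement is to filter peaks by prominence, which scipy.signal.find_peaks supports through its prominence argument. The sketch below is not part of the original answer: the helper name get_prominent_modes, the min_rel_prominence parameter, and the one-standard-deviation padding of the grid are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from scipy.signal import find_peaks
from scipy.stats import gaussian_kde


def get_prominent_modes(data, precision=0.1, min_rel_prominence=0.1):
    """Hypothetical helper: KDE peak locations whose prominence is at least
    `min_rel_prominence` times the maximum of the density curve."""
    data = pd.Series(data, dtype=float).dropna()
    kernel = gaussian_kde(data.to_numpy())
    # Assumption: pad the grid by one standard deviation on each side so
    # that a mode sitting at the edge of the data range has grid points
    # on both sides (find_peaks never reports the first or last sample).
    pad = data.std()
    grid = np.arange(data.min() - pad, data.max() + pad + precision, precision)
    curve = kernel(grid)
    # Keep only peaks that stand out from their surroundings by the
    # requested fraction of the tallest density value.
    peaks, _ = find_peaks(curve, prominence=min_rel_prominence * curve.max())
    return grid[peaks]


# Usage on a synthetic sample with two well-separated clusters
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(8.7, 0.2, 200), rng.normal(17.7, 0.2, 200)])
modes = get_prominent_modes(sample)
print(modes)
```

Whether a small peak such as the one near 0 is dropped depends on the threshold: raising min_rel_prominence discards more of the weaker modes, and setting it to 0 keeps every local maximum.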