Tags: python, python-3.x, pandas, scipy, signal-processing

How to programmatically identify the first and second minima and the peak in a bell-shaped curve?


I'm working with a dataset that forms a bell-shaped (normal-like) distribution. I'm trying to find three specific points in this distribution:

  1. The first minimum point before the curve ascends.
  2. The peak of the curve (the maximum point).
  3. The second minimum as the curve descends after the peak.

The challenge is that the "minimum" points in the tails of my curve are not well-defined (the data flattens out), so it's tricky to identify these points precisely. I understand that I might need to find where the curve starts to rise (where the first derivative changes from negative to positive) for the first minimum, and where it stops falling after the peak (where the first derivative goes from negative back to zero or positive) for the second minimum.

Here is the graph of the data: [image]

Here's a simplified structure of my data after loading it with pandas:

# ... (data loading code) ...
print(df.head())

The y-axis values:

0    0.000006775275118
1    0.000002841071152
2    0.000002331050869
3    0.000002098089639
4    0.000001958793763
5    0.000001882957831
6    0.000001817511261
7    0.000001778793930
8    0.000001747600657
9    0.000001726581760
10   0.000001736836910
11   0.000001725393807
12   0.000001735801905
13   0.000001722637070
14   0.000001749210289
15   0.000001743336865
16   0.000001773540895
17   0.000001758737558
18   0.000001792945553
19   0.000001789850672
20   0.000001779160328
21   0.000001807576901
22   0.000001808267621
23   0.000001818196607
24   0.000001811775275
25   0.000001818907290
26   0.000001807848091
27   0.000001836718285
28   0.000001808366208
29   0.000001808187769
30   0.000001782767490
31   0.000001769246699
32   0.000001775707035
33   0.000001759920903
34   0.000001737253676
35   0.000001722037872
36   0.000001727249139
37   0.000001693093662
38   0.000001701267438
39   0.000001692311112
40   0.000001678170239
41   0.000001661488536
42   0.000001668086770
43   0.000001667761220
44   0.000001662043200
45   0.000001667680139
46   0.000001659051206
47   0.000001708371198
48   0.000001732222077
49   0.000001774399919
50   0.000001876523600
51   0.000002025685347
52   0.000002259535699
53   0.000002560415994
54   0.000003055340098
55   0.000003727916538
56   0.000004705124476
57   0.000005971950809
58   0.000007664882924
59   0.000009665827809
60   0.000012083860418
61   0.000014769510653
62   0.000017550004674
63   0.000020119588986
64   0.000022386885842
65   0.000024171012583
66   0.000025206126640
67   0.000025491871789
68   0.000024878712706
69   0.000023424992853
70   0.000021276252458
71   0.000018607410922
72   0.000015824313725
73   0.000012923828210
74   0.000010311275904
75   0.000008025889954
76   0.000006292151302
77   0.000004904108668
78   0.000003974381668
79   0.000003333372577
80   0.000002833383398
81   0.000002537387898
82   0.000002308652989
83   0.000002216008051
84   0.000002145439742
85   0.000002146526344
86   0.000002167240574
87   0.000002248661389
88   0.000002323548464
89   0.000002430060014
90   0.000002537689493
91   0.000002347846822
Name:  diff_current, dtype: float64

I'm currently using the find_peaks function from SciPy to locate the peaks and the minima (by inverting the y values), following https://stackoverflow.com/a/56812929/10543310. But I am still not able to get only the values of the first minimum, the peak, and the second minimum.

Could anyone guide me on how to identify these three points? Any help with the logic or code would be greatly appreciated.


Solution

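  • Slice the series at the position of the maximum, then take the minimum of each side:

    # data is the pandas Series shown in the question
    peak = data.argmax()  # integer position of the maximum
    print(f'Maximum: {data.iloc[peak]} at {peak}')
    print(f'Left minimum: {data[:peak].min()}')   # smallest value before the peak
    print(f'Right minimum: {data[peak:].min()}')  # smallest value from the peak onward

    Output:

    Maximum: 2.5491871789e-05 at 67
    Left minimum: 1.659051206e-06
    Right minimum: 2.145439742e-06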
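
If the positions of the two minima are needed as well as their values, the same slicing idea can be extended with argmin. The following is only a minimal sketch: the helper name find_min_peak_min and the short series are made up for illustration (the numbers merely imitate the shape of the data in the question), and it assumes the curve has a single dominant peak that does not sit at either end.

```python
import pandas as pd


def find_min_peak_min(data: pd.Series) -> tuple[int, int, int]:
    """Return integer positions (left_min, peak, right_min).

    Hypothetical helper: assumes one dominant peak that is not
    the first or last sample.
    """
    values = data.to_numpy()
    peak = int(values.argmax())
    if peak == 0 or peak == len(values) - 1:
        raise ValueError("peak is at the edge, so one side has no minimum")
    # Position of the smallest value strictly before the peak
    left = int(values[:peak].argmin())
    # argmin on the right-hand slice is relative to the slice start,
    # so shift it back to a position in the full series
    right = peak + 1 + int(values[peak + 1:].argmin())
    return left, peak, right


# Made-up values shaped like the question's data: a high first sample,
# a flat noisy tail, a tall peak, then a tail that rises again at the end
data = pd.Series(
    [6.8, 2.8, 1.9, 1.7, 1.8, 1.66, 2.6, 12.1, 25.5, 15.8, 4.9, 2.3, 2.1, 2.4]
) * 1e-6

left, peak, right = find_min_peak_min(data)
print(f"left={left} peak={peak} right={right}")  # left=5 peak=8 right=12
```

Note that this picks the global minimum on each side of the peak. When the tails are flat and noisy, as in the question, that point may lie far from where the curve visibly starts to rise, so smoothing the data first or limiting the slice to a window around the peak may be worth trying.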