I have two columns of data, corresponding to the x- and y-axes, that I eventually want to plot as a curve.
The problem is that, because of the nature of the data points, the plot ends up with two peaks. I want to keep only the highest peak and discard the lower peak(s) (that is, plot the entire highest peak, not just its highest point).
Is there a way to do that in Python? I haven't included any code because I am not sure how to approach this at all.
Here are the data points (input) as well as the graph:
You can use scipy's argrelextrema to get all the peaks, find the highest one, and then build a mask array for the peak you want to plot. This gives you full control based on your data: for example, mincutoff sets the level at which the highest peak is considered to end and a separate peak could begin.
import numpy as np
from scipy.signal import argrelextrema
import matplotlib.pyplot as plt
#Setup and plot data
fig, ax = plt.subplots(1,2)
y = np.array([0,0,0,0,0,6.14,7.04,5.6,0,0,0,0,0,0,0,0,0,0,0,16.58,60.06,99.58,100,50,0.,0.,0.])
x = np.linspace(3.92,161,y.size)
ax[0].plot(x,y)
#get peaks
peaks_indx = argrelextrema(y, np.greater)[0]
peaks = y[peaks_indx]
ax[0].plot(x[peaks_indx],y[peaks_indx],'o')
#Get maxpeak
maxpeak = 0.
for p in peaks_indx:
    print(p)
    if y[p] > maxpeak:
        maxpeak = y[p]
        maxpeak_indx = p
#Get mask of data around maxpeak to plot
mincutoff = 0.
indx_to_plot = np.zeros(y.size, dtype=bool)
#Walk left from the peak; keep the first point at or below the cutoff, then stop
for i in range(maxpeak_indx+1):
    indx_to_plot[maxpeak_indx-i] = True
    if y[maxpeak_indx-i] <= mincutoff:
        break
#Walk right from the peak in the same way
for i in range(y.size-maxpeak_indx):
    indx_to_plot[maxpeak_indx+i] = True
    if y[maxpeak_indx+i] <= mincutoff:
        break
ax[1].plot(x[indx_to_plot],y[indx_to_plot])
plt.show()
The result is then,
UPDATE: Code to plot only the largest peak.
import numpy as np
from scipy.signal import argrelextrema
import matplotlib.pyplot as plt
#Setup and plot data
y = np.array([0,0,0,0,0,6.14,7.04,5.6,0,0,0,0,0,0,
0,0,0,0,0,16.58,60.06,99.58,100,50,0.,0.,0.])
x = np.linspace(3.92,161,y.size)
#get peaks
peaks_indx = argrelextrema(y, np.greater)[0]
peaks = y[peaks_indx]
#Get maxpeak
maxpeak = 0.
for p in peaks_indx:
    print(p)
    if y[p] > maxpeak:
        maxpeak = y[p]
        maxpeak_indx = p
#Get mask of data around maxpeak to plot
mincutoff = 0.
indx_to_plot = np.zeros(y.size, dtype=bool)
#Walk left from the peak; keep the first point at or below the cutoff, then stop
for i in range(maxpeak_indx+1):
    indx_to_plot[maxpeak_indx-i] = True
    if y[maxpeak_indx-i] <= mincutoff:
        break
#Walk right from the peak in the same way
for i in range(y.size-maxpeak_indx):
    indx_to_plot[maxpeak_indx+i] = True
    if y[maxpeak_indx+i] <= mincutoff:
        break
#Plot just the highest peak
plt.plot(x[indx_to_plot],y[indx_to_plot])
plt.show()
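If you need this more than once, the steps above could be wrapped in a small helper. This is only a sketch: the function name highest_peak_mask is made up for illustration, and it assumes the data has at least one interior local maximum that argrelextrema can see (flat-topped peaks are not detected by np.greater).

```python
import numpy as np
from scipy.signal import argrelextrema


def highest_peak_mask(y, mincutoff=0.0):
    """Return a boolean mask covering the tallest peak of y.

    Hypothetical helper: the mask runs from the tallest local maximum
    outwards in both directions and includes the first point on each
    side that is at or below mincutoff.
    """
    y = np.asarray(y, dtype=float)
    peaks_indx = argrelextrema(y, np.greater)[0]
    if peaks_indx.size == 0:
        raise ValueError("no interior local maximum found")

    # Index of the tallest local maximum
    top = peaks_indx[np.argmax(y[peaks_indx])]

    # Walk outwards until the signal reaches the cutoff or the array edge
    left = top
    while left > 0 and y[left] > mincutoff:
        left -= 1
    right = top
    while right < y.size - 1 and y[right] > mincutoff:
        right += 1

    mask = np.zeros(y.size, dtype=bool)
    mask[left:right + 1] = True
    return mask


y = np.array([0, 0, 0, 0, 0, 6.14, 7.04, 5.6, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 0, 16.58, 60.06, 99.58, 100, 50, 0., 0., 0.])
x = np.linspace(3.92, 161, y.size)
mask = highest_peak_mask(y)
print(x[mask], y[mask])
```

The resulting mask can be passed to plt.plot(x[mask], y[mask]) exactly like indx_to_plot above.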
I would still suggest plotting both peaks to ensure the algorithm is working correctly... I think you will find that identifying an arbitrary peak is probably not always trivial with messy data.
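For messier data, one option that may be worth trying is scipy.signal.find_peaks with a prominence threshold, which ignores small wiggles and also reports a base on each side of every peak. The sketch below is an alternative to the approach above, not a drop-in replacement: the function name tallest_peak_slice and the min_prominence value of 1.0 are assumptions chosen for this example, and the bases scipy reports are not always where you would draw them by eye, so it is still worth plotting the result to check.

```python
import numpy as np
from scipy.signal import find_peaks


def tallest_peak_slice(y, min_prominence=1.0):
    """Return a slice spanning the tallest sufficiently prominent peak.

    Hypothetical helper: min_prominence is a tuning value that has to
    be chosen for the data at hand.
    """
    y = np.asarray(y, dtype=float)
    # Passing prominence makes find_peaks return left_bases/right_bases
    peaks, props = find_peaks(y, prominence=min_prominence)
    if peaks.size == 0:
        raise ValueError("no peak meets the prominence threshold")

    best = int(np.argmax(y[peaks]))  # position of the tallest peak in peaks
    start = int(props["left_bases"][best])
    stop = int(props["right_bases"][best]) + 1
    return slice(start, stop)


y = np.array([0, 0, 0, 0, 0, 6.14, 7.04, 5.6, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 0, 16.58, 60.06, 99.58, 100, 50, 0., 0., 0.])
x = np.linspace(3.92, 161, y.size)
sl = tallest_peak_slice(y)
print(x[sl], y[sl])
```

Raising min_prominence filters out more of the small peaks before the tallest one is chosen.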