Tags: python, matplotlib, graphing

Considering Highest Peak Curve From Two Sets of Data Points


I have two columns of data corresponding to the x- and y-axes, which I will eventually plot as a curve.

The problem is that, because of the nature of the data points, the plot ends up with two peaks. I want to keep only the highest peak when plotting and discard the lower peak(s) (not just the highest point, but the entire highest peak as plotted).

Is there a way to do that in Python? I haven't included any code because I'm not sure how to approach this at all.

Here are the data points (input) as well as the graph:

enter image description here

enter image description here


Solution

  • You can use scipy.signal.argrelextrema to find all the peaks, pick the highest one, and then build a boolean mask for the peak you want to plot. This gives you full control based on your data: for example, the mincutoff threshold decides where the highest peak ends and a separate peak begins.

    import numpy as np
    from scipy.signal import argrelextrema
    import matplotlib.pyplot as plt
    
    #Setup and plot data
    fig, ax = plt.subplots(1,2)
    y = np.array([0,0,0,0,0,6.14,7.04,5.6,0,0,0,0,0,0,0,0,0,0,0,16.58,60.06,99.58,100,50,0.,0.,0.])
    x = np.linspace(3.92,161,y.size)
    ax[0].plot(x,y)
    
    #get peaks
    peaks_indx = argrelextrema(y, np.greater)[0]
    peaks = y[peaks_indx]
    ax[0].plot(x[peaks_indx],y[peaks_indx],'o')
    
    #Get index of the highest peak
    maxpeak_indx = peaks_indx[np.argmax(y[peaks_indx])]
    
    #Get mask of data around maxpeak to plot
    mincutoff = 0.
    indx_to_plot = np.zeros(y.size, dtype=bool)
    #Walk left from the peak; keep the first point at or below the cutoff, then stop
    for i in range(maxpeak_indx, -1, -1):
        indx_to_plot[i] = True
        if y[i] <= mincutoff:
            break
    
    #Walk right from the peak in the same way
    for i in range(maxpeak_indx, y.size):
        indx_to_plot[i] = True
        if y[i] <= mincutoff:
            break
    ax[1].plot(x[indx_to_plot],y[indx_to_plot])
    plt.show()
    

    The result is:

    enter image description here

    UPDATE: Code to plot only the largest peak.

    import numpy as np
    from scipy.signal import argrelextrema
    import matplotlib.pyplot as plt
    
    #Setup and plot data
    y = np.array([0,0,0,0,0,6.14,7.04,5.6,0,0,0,0,0,0,
                  0,0,0,0,0,16.58,60.06,99.58,100,50,0.,0.,0.])
    x = np.linspace(3.92,161,y.size)
    
    #get peaks
    peaks_indx = argrelextrema(y, np.greater)[0]
    peaks = y[peaks_indx]
    
    #Get index of the highest peak
    maxpeak_indx = peaks_indx[np.argmax(y[peaks_indx])]
    
    #Get mask of data around maxpeak to plot
    mincutoff = 0.
    indx_to_plot = np.zeros(y.size, dtype=bool)
    #Walk left from the peak; keep the first point at or below the cutoff, then stop
    for i in range(maxpeak_indx, -1, -1):
        indx_to_plot[i] = True
        if y[i] <= mincutoff:
            break
    
    #Walk right from the peak in the same way
    for i in range(maxpeak_indx, y.size):
        indx_to_plot[i] = True
        if y[i] <= mincutoff:
            break
    
    #Plot just the highest peak
    plt.plot(x[indx_to_plot],y[indx_to_plot])
    plt.show()
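
    If you need this for more than one data set, the steps above can be wrapped in a small function. This is only a sketch: highest_peak_mask is a name I made up for illustration, and it assumes that "end of the peak" means the nearest point on each side where y is at or below mincutoff.

```python
import numpy as np
from scipy.signal import argrelextrema


def highest_peak_mask(y, mincutoff=0.0):
    """Boolean mask selecting the highest peak of y and its flanks.

    Hypothetical helper: keeps everything between the nearest points on
    either side of the highest local maximum where y <= mincutoff
    (those boundary points are included).
    """
    y = np.asarray(y, dtype=float)
    peaks_indx = argrelextrema(y, np.greater)[0]
    if peaks_indx.size == 0:
        raise ValueError("no local maximum found in y")
    maxpeak_indx = peaks_indx[np.argmax(y[peaks_indx])]

    # Indices where the curve is at or below the cutoff
    below = np.flatnonzero(y <= mincutoff)
    left = below[below < maxpeak_indx]
    right = below[below > maxpeak_indx]
    # Fall back to the array ends if the peak never drops to the cutoff
    start = left[-1] if left.size else 0
    stop = right[0] if right.size else y.size - 1

    mask = np.zeros(y.size, dtype=bool)
    mask[start:stop + 1] = True
    return mask


y = np.array([0, 0, 0, 0, 0, 6.14, 7.04, 5.6, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 0, 16.58, 60.06, 99.58, 100, 50, 0., 0., 0.])
mask = highest_peak_mask(y)
print(np.flatnonzero(mask))  # [18 19 20 21 22 23 24]
```

    You would then plot x[mask], y[mask] exactly as in the code above.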
    

    I would still suggest plotting both peaks to check that the algorithm is working correctly. Identifying an arbitrary peak is not always trivial with messy data.
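
    For messier data, one option (not what the code above uses) is scipy.signal.find_peaks, which can filter peaks by prominence and also detects flat-topped peaks that argrelextrema with np.greater would miss. A minimal sketch, assuming a prominence threshold of 1.0 is enough to ignore noise; that value is a guess and needs tuning for real data:

```python
import numpy as np
from scipy.signal import find_peaks

y = np.array([0, 0, 0, 0, 0, 6.14, 7.04, 5.6, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 0, 16.58, 60.06, 99.58, 100, 50, 0., 0., 0.])

# prominence=1.0 is an assumed noise threshold, not a recommended value
peaks_indx, props = find_peaks(y, prominence=1.0)

# Pick the peak that stands out most from its surroundings
best = peaks_indx[np.argmax(props["prominences"])]
print(int(best))  # 22
```

    The index in best can replace maxpeak_indx in the mask-building step.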