Search code examples
pythonnumpyscipysignal-processingpeak-detection

peak integration using trapz but not starting from baseline and visualization


I am trying to integrate signal peaks using numpy.trapz or scipy.trapz. I do manually select the range of a peak min and max. the problem I am facing is that I think these functions integerates the AUC till the baseline. Sometimes my signal start or end is above the baseline. I am open to use a baseline correction method, but I need to visualize the integrated area without any correction to make sure they are correct.

image source: https://terpconnect.umd.edu/~toh/spectrum/Integration.html

Reference Here

An Example to generate data

import numpy
import peakutils
from peakutils.plot import plot as pplot
from matplotlib import pyplot


centers = (30.5, 72.3)
x = numpy.linspace(0, 120, 121)
y = (peakutils.gaussian(x, 5, centers[0], 3) +
    peakutils.gaussian(x, 7, centers[1], 10) +
    numpy.random.rand(x.size))
pyplot.figure(figsize=(10,6))
pyplot.plot(x, y)
pyplot.title("Data with noise")

enter image description here


Solution

  • To get the area under the peak you can interpolate the curve using scipy.interpolate.make_interp_spline which has an integrate method. This will be the entire area under the curve between x=a and x=b, so you will then want to subtract the area of the trapezoid created by connecting points at x=a and x=b and the x-axis. In the code I share below I assume f(a) >= 0 and f(b) >= 0.

    Note: I added another peak and reduced the noise level to make the endpoints more prominent/obvious.

    import numpy as np
    import peakutils
    import matplotlib.pyplot as plt
    from scipy.signal import find_peaks
    from scipy.interpolate import make_interp_spline
    
    plt.close("all")
    
    rng = np.random.default_rng(42)
    
    centers = (30.5, 72.3, 112.1)
    x = np.arange(0, 120)
    y = (peakutils.gaussian(x, 5, centers[0], 3)
         + peakutils.gaussian(x, 7, centers[1], 10)
         + peakutils.gaussian(x, 6, centers[2], 5)
         + rng.random(x.size)*0.2)
         
    peak_indices, _ = find_peaks(-y, prominence=4)
    
    fig, ax = plt.subplots()
    ax.plot(x, y, label="Data")
    ax.plot(x[peak_indices], y[peak_indices], "x", color="r", label="End points")
    # ax.title("Data with noise")
    
    
    x1, x2 = x[peak_indices]
    
    interp = make_interp_spline(x, y, k=1)
    m = (interp(x2) - interp(x1))/(x2 - x1)
    x = np.linspace(x1, x2, 200)
    ax.fill_between(x, interp(x), m*(x - x1) + interp(x1), alpha=0.5, 
                    label="Desired area")
    ax.fill_between(x, m*(x - x1) + interp(x1), alpha=0.5, 
                    label="Trapezoid area")
    
    ax.legend()
    
    # Integral under the peak minus the area of the trapezoid connecting the two
    # ends of the peak.
    # NOTE: EDGE CASE OF PEAK ENDS GOING BELOW THE X-AXIS IS NOT BEING CONSIDERED
    integral = interp.integrate(x1, x2) - 0.5*(x2-x1)*(interp(x1) + interp(x2))
    print(integral)  # 163.8743118056535