Find inflection point: connect gradient spline to original data


My final goal is to determine the inflection points from these two dominant peaks. Therefore I want to fit a spline to the data and find the inflection point somehow afterwards.

spline fit

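import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate

# smoothing cubic spline fitted to the numerical gradient of sensor "101"
t, c, k = interpolate.splrep(df_sensors_100ppm["Measurement_no"], np.gradient(df_sensors_100ppm["101"]),
                             s=len(df_sensors_100ppm["Measurement_no"]), k=3)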

N = 500
xmin, xmax = df_sensors_100ppm["Measurement_no"].min(), df_sensors_100ppm["Measurement_no"].max()
xx = np.linspace(xmin, xmax, N)
spline = interpolate.BSpline(t, c, k, extrapolate=False)

plt.plot(df_sensors_100ppm["Measurement_no"], df_sensors_100ppm["101"], 'bo', label='Original points')
plt.plot(df_sensors_100ppm["Measurement_no"], df_sensors_100ppm["101"], '-', label='', alpha = 0.3)
plt.plot(xx, spline(xx), 'r', label='BSpline')
plt.grid()
plt.legend(loc='best')
plt.show()

max_idx = np.argmax(spline(xx))
> 336

My problem is that I don't know what the number 336 represents. I thought it would be the data point at which the gradient is highest, but there are only 61 data points. How can I connect the gradient spline to my data points to find the data point I am looking for? It doesn't matter that the inflection point may not fall exactly on a data point; a data point next to it is fine. I also don't need the exact numbering of the data point (on the x-axis above, the range is from 6830 to ~6890): either that numbering or the zero-based position of the data point would work. I appreciate any help!

df_sensors_100ppm
Measurement_no 101
6833    1081145.8
6834    1071195.6
6835    1061668.0
6836    841877.0
6837    227797.5
6838    154449.2
6839    130070.3
6840    119169.5
6841    113275.4
6842    92762.5
6843    103557.7
6844    324869.6
6845    318933.3
6846    275562.4
6847    243599.4
6848    220276.8
6849    203228.2
6850    189876.8
6851    178849.3
6852    169680.8
6853    162223.4
6854    156308.3
6855    151195.9
6856    147203.1
6857    143907.5
6858    141076.7
6859    138626.1
6860    136471.3
6861    134422.2
6862    132542.0
6863    130661.8
6864    128845.0
6865    126880.3
6866    125084.6
6867    123162.2
6868    121282.0
6869    119275.1
6870    117352.7
6871    115219.0
6872    113402.2
6873    111353.0
6874    94959.5
6875    102269.0
6876    327911.7
6877    318193.9
6878    273175.2
6879    241212.2
6880    218354.3
6881    201073.4
6882    187806.5
6883    176821.2
6884    167864.0
6885    160406.6
6886    154385.8
6887    149653.7
6888    145851.1
6889    142534.4
6890    139893.7
6891    137464.2
6892    135246.0
6893    133239.1
6894    131422.3
6895    129499.9
6896    127577.5

Solution

  • You don't need to construct the gradient of the data yourself: you can fit the spline to the data directly and use its derivative method. I personally prefer InterpolatedUnivariateSpline for data that does not need smoothing (it passes through all points):

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.interpolate import InterpolatedUnivariateSpline as IUS
    
    # plain NumPy arrays, so that the indexing in the last step is positional
    x = df_sensors_100ppm["Measurement_no"].to_numpy()
    y = df_sensors_100ppm["101"].to_numpy()
    
    spline = IUS(x, y)
    N = 500
    xx = np.linspace(x.min(), x.max(), N)
    
    plt.plot(x, y, 'go')
    plt.plot(xx, spline(xx))
    plt.plot(xx, spline.derivative()(xx))
    
    # np.argsort returns the positions that sort the array from min to max; in your case you want the last two
    
    x[np.argsort(spline.derivative()(x))[-2:]]
    >>array([6843., 6875.])
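
As for the number 336 in the question: np.argmax(spline(xx)) returns a position in the 500-point grid xx, not in the original data, so xx[336] is the measurement number it corresponds to. The sketch below shows one way to map such a grid index back to the nearest data point. It uses the values posted in the question and evaluates the spline derivative on the dense grid; the names grid_idx, x_at_max and data_idx are only illustrative.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

# Measurement numbers and sensor "101" readings copied from the question.
x = np.arange(6833, 6897, dtype=float)
y = np.array([
    1081145.8, 1071195.6, 1061668.0, 841877.0,
    227797.5, 154449.2, 130070.3, 119169.5,
    113275.4, 92762.5, 103557.7, 324869.6,
    318933.3, 275562.4, 243599.4, 220276.8,
    203228.2, 189876.8, 178849.3, 169680.8,
    162223.4, 156308.3, 151195.9, 147203.1,
    143907.5, 141076.7, 138626.1, 136471.3,
    134422.2, 132542.0, 130661.8, 128845.0,
    126880.3, 125084.6, 123162.2, 121282.0,
    119275.1, 117352.7, 115219.0, 113402.2,
    111353.0, 94959.5, 102269.0, 327911.7,
    318193.9, 273175.2, 241212.2, 218354.3,
    201073.4, 187806.5, 176821.2, 167864.0,
    160406.6, 154385.8, 149653.7, 145851.1,
    142534.4, 139893.7, 137464.2, 135246.0,
    133239.1, 131422.3, 129499.9, 127577.5,
])

spline = InterpolatedUnivariateSpline(x, y)  # cubic (k=3) by default
xx = np.linspace(x.min(), x.max(), 500)

# argmax over the dense grid gives a position in xx, not in the data
grid_idx = int(np.argmax(spline.derivative()(xx)))
x_at_max = xx[grid_idx]  # measurement number where the spline rises fastest

# nearest original data point: zero-based position and measurement number
data_idx = int(np.argmin(np.abs(x - x_at_max)))
print(grid_idx, x_at_max, data_idx, x[data_idx])
```

Because the measurement numbers are spaced one apart, the nearest data point is never more than 0.5 away from the grid location. The same lookup works for any other grid index, for example one taken from np.argsort to pick up the second peak.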