My final goal is to determine the inflection points of the two dominant peaks. To do that, I want to fit a spline to the data and then locate the inflection points from it.
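import numpy as np
import matplotlib.pyplot as plt
from scipy import interpolate

# Fit a smoothing cubic spline to the gradient of the signal
t, c, k = interpolate.splrep(df_sensors_100ppm["Measurement_no"], np.gradient(df_sensors_100ppm["101"]),
                             s=len(df_sensors_100ppm["Measurement_no"]), k=3)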
N = 500
xmin, xmax = df_sensors_100ppm["Measurement_no"].min(), df_sensors_100ppm["Measurement_no"].max()
xx = np.linspace(xmin, xmax, N)
spline = interpolate.BSpline(t, c, k, extrapolate=False)
plt.plot(df_sensors_100ppm["Measurement_no"], df_sensors_100ppm["101"], 'bo', label='Original points')
plt.plot(df_sensors_100ppm["Measurement_no"], df_sensors_100ppm["101"], '-', label='', alpha = 0.3)
plt.plot(xx, spline(xx), 'r', label='BSpline')
plt.grid()
plt.legend(loc='best')
plt.show()
max_idx = np.argmax(spline(xx))
> 336
My problem is that I don't know what this number, 336, represents. I thought it would be the data point at which the gradient is highest, but there are only 61 data points.
How can I connect the gradient spline with my data points to find the data point I am looking for?
It doesn't matter that the inflection point may not coincide with a data point; I am happy with a data point next to it.
I also don't need a particular numbering for that data point: either the x-axis numbering (which runs from 6830 to ~6890 in the plot above) or a zero-based index of the data points would do.
I appreciate any help!
df_sensors_100ppm
Measurement_no 101
6833 1081145.8
6834 1071195.6
6835 1061668.0
6836 841877.0
6837 227797.5
6838 154449.2
6839 130070.3
6840 119169.5
6841 113275.4
6842 92762.5
6843 103557.7
6844 324869.6
6845 318933.3
6846 275562.4
6847 243599.4
6848 220276.8
6849 203228.2
6850 189876.8
6851 178849.3
6852 169680.8
6853 162223.4
6854 156308.3
6855 151195.9
6856 147203.1
6857 143907.5
6858 141076.7
6859 138626.1
6860 136471.3
6861 134422.2
6862 132542.0
6863 130661.8
6864 128845.0
6865 126880.3
6866 125084.6
6867 123162.2
6868 121282.0
6869 119275.1
6870 117352.7
6871 115219.0
6872 113402.2
6873 111353.0
6874 94959.5
6875 102269.0
6876 327911.7
6877 318193.9
6878 273175.2
6879 241212.2
6880 218354.3
6881 201073.4
6882 187806.5
6883 176821.2
6884 167864.0
6885 160406.6
6886 154385.8
6887 149653.7
6888 145851.1
6889 142534.4
6890 139893.7
6891 137464.2
6892 135246.0
6893 133239.1
6894 131422.3
6895 129499.9
6896 127577.5
You don't need to compute the gradient of the data yourself: you can fit the spline to the data directly and use its derivative method. I personally prefer InterpolatedUnivariateSpline for data that does not need smoothing (as it passes through all points):
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import InterpolatedUnivariateSpline as IUS

# Work on plain NumPy arrays so the positional indexing further down is unambiguous
x = df_sensors_100ppm["Measurement_no"].to_numpy()
y = df_sensors_100ppm["101"].to_numpy()
spline = IUS(x, y)
N = 500
xx = np.linspace(x.min(), x.max(), N)
plt.plot(x, y, 'go')
plt.plot(xx, spline(xx))
plt.plot(xx, spline.derivative()(xx))
# np.argsort returns the positions that sort the array in ascending order; the last two are the points with the largest derivative
x[np.argsort(spline.derivative()(x))[-2:]]
>>array([6843., 6875.])
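As for the 336 in the question: np.argmax(spline(xx)) returns a position in the 500-point grid xx, not in the data, so xx[336] is the matching Measurement_no (roughly 6875.4 for the 6833 to 6896 range listed above, which agrees with the 6875 found here). Below is a minimal sketch of mapping such a grid index back to the nearest data point. It uses a short excerpt of the posted data around the first jump, and the names grid_idx, x_steepest and data_idx are only illustrative:

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

# Excerpt of the posted data around the first jump (Measurement_no 6840 to 6850).
x = np.arange(6840, 6851, dtype=float)
y = np.array([119169.5, 113275.4, 92762.5, 103557.7, 324869.6, 318933.3,
              275562.4, 243599.4, 220276.8, 203228.2, 189876.8])

spline = InterpolatedUnivariateSpline(x, y)   # cubic spline through every point
slope = spline.derivative()                   # first derivative as a new spline

xx = np.linspace(x.min(), x.max(), 500)       # dense grid, as in the question
grid_idx = np.argmax(slope(xx))               # position in xx, NOT in the data
x_steepest = xx[grid_idx]                     # same units as Measurement_no

# Zero-based position of the sample closest to the steepest point
data_idx = np.argmin(np.abs(x - x_steepest))
print(grid_idx, x_steepest, data_idx, x[data_idx])
```

Here x_steepest answers "where on the x-axis", while data_idx and x[data_idx] give the neighbouring data point in zero-based and Measurement_no numbering respectively.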