What do I have?
I have a pandas DataFrame, df,
with a 'power' column and a 'timestamp' column.
I also have a power value called 'x_power'.
What do I want?
I am trying to find out how much energy was produced above and below 'x_power' (for example, x_power = 900). To do that, I think I need to integrate the following plot:
y axis - power
x axis - timestamp
pink area - energy produced above x_power
green area - energy produced below x_power
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

x = df['timestamp'].to_numpy()
y = df['power'].to_numpy()
# Integration limits (note: these names shadow the built-ins max and min)
max = np.max(x)
min = np.min(x)
f = InterpolatedUnivariateSpline(x, y, k=1)  # k=1 gives linear interpolation
f.integral(min, max)
The output is the total area under the curve.
Is there an easy way to calculate the areas above and below 'x_power' without multiple integrations?
To integrate the area of your plot above x_power, you need to "move down" your y values so that the "new 0" is at the x_power level.
Then you should clip the negative values to zero.
But because you have only selected points of the whole plot, clipping them directly would bend the line at the wrong places. So the first step should be to generate an interpolated version of your power production line, e.g. with a step of 1, and only then perform the two steps above.
The code to do it is:
intStep = 1 # Interpolation step
# Interpolated x and y
xInt = np.arange(min, max + 1, intStep)
yInt = (np.interp(xInt, x, y) - x_power).clip(min=0)
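To illustrate why the interpolation has to come before the clipping, here is a small sketch on made-up data (the two sample points and the use of scipy.integrate.trapezoid are assumptions for the illustration, not part of the original code). The power rises linearly from 800 to 1000, so it crosses x_power = 900 halfway, and the true area above the threshold is a triangle of 0.5 * 5 * 100 = 250.

```python
import numpy as np
from scipy.integrate import trapezoid

# Hypothetical data: two samples only, power rising from 800 to 1000
x = np.array([0.0, 10.0])
y = np.array([800.0, 1000.0])
x_power = 900

# Clipping the raw samples: the line is treated as rising from 0 at x=0,
# which overestimates the area above the threshold
coarse = trapezoid((y - x_power).clip(min=0), x)

# Interpolating first (step of 1), then shifting and clipping
x_int = np.arange(x.min(), x.max() + 1, 1)
y_int = (np.interp(x_int, x, y) - x_power).clip(min=0)
fine = trapezoid(y_int, x_int)

print(coarse, fine)
```

The finer the interpolation step, the closer the clipped line bends to the real crossing points; here the crossing at x = 5 happens to fall exactly on the grid.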
To see the clipped line (yInt), you can run:
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.grid(True)
ax.plot(xInt, yInt)
plt.show()
And to integrate this function, run your code, but on the interpolated data (xInt, yInt):
f = InterpolatedUnivariateSpline(xInt, yInt, k=1)
result = f.integral(min, max)
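The question also asks for the energy below x_power. One possible way to get it, sketched here on made-up sample data, is to cap the interpolated curve at x_power with np.minimum and integrate that; equivalently, it is the total area minus the area above. The sample arrays, the snake_case variable names and the area helper are hypothetical and only stand in for df['timestamp'], df['power'] and the code above.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

# Hypothetical sample data standing in for df['timestamp'] and df['power']
x = np.array([0.0, 10.0, 20.0, 30.0, 40.0])
y = np.array([800.0, 1000.0, 1000.0, 800.0, 800.0])
x_power = 900

# Interpolate on a fine grid first (step of 1)
x_int = np.arange(x.min(), x.max() + 1, 1)
y_int = np.interp(x_int, x, y)

def area(values):
    """Integrate a piecewise-linear curve sampled on x_int (hypothetical helper)."""
    f = InterpolatedUnivariateSpline(x_int, values, k=1)
    return float(f.integral(x_int[0], x_int[-1]))

total = area(y_int)
above = area((y_int - x_power).clip(min=0))  # part of the curve over x_power
below = area(np.minimum(y_int, x_power))     # curve capped at x_power

print(total, above, below)
```

Since above + below equals the total, only one extra integration is strictly needed; the third call is kept here as a consistency check.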