I have a set of data, basically with the information of f(x) as a function of x, and x itself. I know from the theory of the problem that I'm working on the format of f(x), which is given as the expression below:
Essentially, I want to use this set of data to find the parameters a and b. My problem is: How can I do that? What library should I use? I would like an answer using Python. But R or Julia would be ok as well.
From everything I had done so far, I've read about a functionallity called curve fit from the SciPy library but I'm having some trouble in which form I would do the code as long my x variable is located in one of the integration limit.
For better ways of working with the problem, I also have the following resources:
A sample set, for which I know the parameters I'm looking for. To this set I know that a = 2 and b = 1 (and c = 3). And before it rises some questions about how I know these parameters: I know they because I created this sample set using this parameters from the integration of the equation above just to use the sample to investigate how can I find them and have a reference.
I also have this set, for which the only information I have is that c = 4 and want to find a and b.
I would also like to point out that:
i) right now I have no code to post here because I don't have a clue how to write something to solve my problem. But I would be happy to edit and update the question after reading any answer or help that you guys could provide me.
ii) I'm looking first for a solution where I don't know a and b. But in case that it is too hard I would be happy to see some solution where I suppose that one either a or b is known.
EDIT 1: I would like to reference this question to anyone interested in this problem as it's a parallel but also important discussion to the problem faced here
I would use a pure numeric approach, which you can use even when you can not directly solve the integral. Here's a snipper for fitting only the a
parameter:
import numpy as np
from scipy.optimize import curve_fit
import pandas as pd
import matplotlib.pyplot as plt
def integrand(x, a):
b = 1
c = 3
return 1/(a*np.sqrt(b*(1+x)**3 + c*(1+x)**4))
def integral(x, a):
dx = 0.001
xx = np.arange(0, x, dx)
arr = integrand(xx, a)
return np.trapz(arr, dx=dx, axis=-1)
vec_integral = np.vectorize(integral)
df = pd.read_csv('data-with-known-coef-a2-b1-c3.csv')
x = df.domin.values
y = df.resultados2.values
out_mean, out_var = curve_fit(vec_integral, x, y, p0=[2])
plt.plot(x, y)
plt.plot(x, vec_integral(x, out_mean[0]))
plt.title(f'a = {out_mean[0]:.3f} +- {np.sqrt(out_var[0][0]):.3f}')
plt.show()
vec_integral = np.vectorize(integral)
Of course, you can lower the value of dx
to get the desired precision. While for fitting just the a
, when you try to fir b
as well, the fit does not converge properly (in my opinion because a
and b
are strongly correlated). Here's what you get:
def integrand(x, a, b):
c = 3
return 1/(a*np.sqrt(np.abs(b*(1+x)**3 + c*(1+x)**4)))
def integral(x, a, b):
dx = 0.001
xx = np.arange(0, x, dx)
arr = integrand(xx, a, b)
return np.trapz(arr, dx=dx, axis=-1)
vec_integral = np.vectorize(integral)
out_mean, out_var = sp.optimize.curve_fit(vec_integral, x, y, p0=[2,3])
plt.title(f'a = {out_mean[0]:.3f} +- {np.sqrt(out_var[0][0]):.3f}\nb = {out_mean[1]:.3f} +- {np.sqrt(out_var[1][1]):.3f}')
plt.plot(x, y, alpha=0.4)
plt.plot(x, vec_integral(x, out_mean[0], out_mean[1]), color='green', label='fitted solution')
plt.plot(x, vec_integral(x, 2, 1),'--', color='red', label='theoretical solution')
plt.legend()
plt.show()
As you can see, even if the resulting a
and b
parameters form the fit are "not good", the plot is very similar.