python curve-fitting numerical-methods data-fitting model-fitting

Fitting parameter inside an integral using python (or another useful language)

I have a set of data, basically with the information of f(x) as a function of x, and x itself. I know from the theory of the problem that I'm working on the format of f(x), which is given as the expression below:

Essentially, I want to use this set of data to find the parameters a and b. My problem is: How can I do that? What library should I use? I would like an answer using Python. But R or Julia would be ok as well.

From everything I had done so far, I've read about a functionallity called curve fit from the SciPy library but I'm having some trouble in which form I would do the code as long my x variable is located in one of the integration limit.

For better ways of working with the problem, I also have the following resources:

A sample set, for which I know the parameters I'm looking for. To this set I know that a = 2 and b = 1 (and c = 3). And before it rises some questions about how I know these parameters: I know they because I created this sample set using this parameters from the integration of the equation above just to use the sample to investigate how can I find them and have a reference.

I also have this set, for which the only information I have is that c = 4 and want to find a and b.

I would also like to point out that:

i) right now I have no code to post here because I don't have a clue how to write something to solve my problem. But I would be happy to edit and update the question after reading any answer or help that you guys could provide me.

ii) I'm looking first for a solution where I don't know a and b. But in case that it is too hard I would be happy to see some solution where I suppose that one either a or b is known.

EDIT 1: I would like to reference this question to anyone interested in this problem as it's a parallel but also important discussion to the problem faced here

Solution

I would use a pure numeric approach, which you can use even when you can not directly solve the integral. Here's a snipper for fitting only the a parameter:

import numpy as np
from scipy.optimize import curve_fit
import pandas as pd
import matplotlib.pyplot as plt

def integrand(x, a):
    b = 1
    c = 3
    return 1/(a*np.sqrt(b*(1+x)**3 + c*(1+x)**4))

def integral(x, a):
    dx = 0.001
    xx = np.arange(0, x, dx)
    arr = integrand(xx, a)
    return np.trapz(arr, dx=dx, axis=-1)

vec_integral = np.vectorize(integral)

df = pd.read_csv('data-with-known-coef-a2-b1-c3.csv')
x = df.domin.values
y = df.resultados2.values
out_mean, out_var = curve_fit(vec_integral, x, y, p0=[2])

plt.plot(x, y)
plt.plot(x, vec_integral(x, out_mean[0]))
plt.title(f'a = {out_mean[0]:.3f} +- {np.sqrt(out_var[0][0]):.3f}')
plt.show()

vec_integral = np.vectorize(integral)

Of course, you can lower the value of dx to get the desired precision. While for fitting just the a, when you try to fir b as well, the fit does not converge properly (in my opinion because a and b are strongly correlated). Here's what you get:

def integrand(x, a, b):
    c = 3
    return 1/(a*np.sqrt(np.abs(b*(1+x)**3 + c*(1+x)**4)))

def integral(x, a, b):
    dx = 0.001
    xx = np.arange(0, x, dx)
    arr = integrand(xx, a, b)
    return np.trapz(arr, dx=dx, axis=-1)

vec_integral = np.vectorize(integral)

out_mean, out_var = sp.optimize.curve_fit(vec_integral, x, y, p0=[2,3])
plt.title(f'a = {out_mean[0]:.3f} +- {np.sqrt(out_var[0][0]):.3f}\nb = {out_mean[1]:.3f} +- {np.sqrt(out_var[1][1]):.3f}')

plt.plot(x, y, alpha=0.4)
plt.plot(x, vec_integral(x, out_mean[0], out_mean[1]), color='green', label='fitted solution')
plt.plot(x, vec_integral(x, 2, 1),'--', color='red', label='theoretical solution')
plt.legend()
plt.show()

As you can see, even if the resulting a and b parameters form the fit are "not good", the plot is very similar.