Search code examples
pythonscipycurve-fittingminimizemse

Specify a function with unknown coefficients as well as data and find coefficients in python


I have a function: f(theta) = a+b*cos(theta - c) as well as sampled data. I'd like to find the coefficients a, b, and c that minimize mean square error. Any idea if there's an efficient way to do this in python?

EDIT:

import numpy as np
from scipy.optimize import curve_fit

#definition of the function 
def myfunc(x, a, b, c):
    return a + b * np.cos(x - c)

#sample data
x_data = [0, 60, 120, 180, 240, 300]
y_data = [25, 40, 70, 30, 10, 15]

#the actual curve fitting procedure, a, b, c are stored in popt
popt, _pcov = curve_fit(myfunc, x_data, y_data)
print(popt)
print(np.degrees(popt[2]))

#the rest is just a graphic representation of the data points and the fitted curve
from matplotlib import pyplot as plt

#x_fit = np.linspace(-1, 6, 1000)
y_fit = myfunc(x_data, *popt)

plt.plot(x_data, y_data, "ro")
plt.plot(x_data, y_fit, "b")
plt.xlabel(r'$\theta$ (degrees)');
plt.ylabel(r'$f(\theta)$');

plt.legend()
plt.show()

Here is a picture showing how the curve doesn't really fit the points. It seems like the amplitude should be higher. The local mins and maxes appear to be in the right places.

enter image description here


Solution

  • scipy.optimize.curve_fit makes it really easy to fit data points to your custom function:

    import numpy as np
    from scipy.optimize import curve_fit
    
    #definition of the function 
    def myfunc(x, a, b, c):
        return a + b * np.cos(x - c)
    
    #sample data
    x_data = np.arange(5)
    y_data = 2.34 + 1.23 * np.cos(x_data + .23)
    
    #the actual curve fitting procedure, a, b, c are stored in popt
    popt, _pcov = curve_fit(myfunc, x_data, y_data)
    print(popt)
    
    #the rest is just a graphic representation of the data points and the fitted curve
    from matplotlib import pyplot as plt
    
    x_fit = np.linspace(-1, 6, 1000)
    y_fit = myfunc(x_fit, *popt)
    
    plt.plot(x_data, y_data, "ro", label = "data points")
    plt.plot(x_fit, y_fit, "b", label = "fitted curve\na = {}\nb = {}\nc = {}".format(*popt))
    
    plt.legend()
    plt.show()
    

    Output:

    [ 2.34  1.23 -0.23]
    

    Edit:

    Your question update introduces several problems. First, your x-values are in degree, while np.cos expects values in radians. Therefore, we better convert the values with np.deg2rad. The reverse function would be np.rad2deg.
    Second, it is a good idea to fit for different frequencies as well, let's introduce an additional parameter for that.
    Third, fits are usually quite sensitive to initial guesses. You can provide a parameter p0 in scipy for that.
    Fourth, you changed the resolution of the fitted curve to the low resolution of your data points, hence it looks so undersampled. If we address all these problems:

    import numpy as np
    from scipy.optimize import curve_fit
    
    #sample data
    x_data = [0, 60, 120, 180, 240, 300]
    y_data = [25, 40, 70, 30, 10, 15]
    
    #definition of the function with additional frequency value d
    def myfunc(x, a, b, c, d):
        return a + b * np.cos(d * np.deg2rad(x) - c)
    #initial guess of parameters a, b, c, d
    p_initial = [np.average(y_data), np.average(y_data), 0, 1]
    
    #the actual curve fitting procedure, a, b, c, d are stored in popt
    popt, _pcov = curve_fit(myfunc, x_data, y_data, p0 = p_initial)
    print(popt)
    #we have to convert the phase shift back into degrees
    print(np.rad2deg(popt[2]))
    
    #graphic representation of the data points and the fitted curve
    from matplotlib import pyplot as plt
    #define x_values for a smooth curve representation
    x_fit = np.linspace(np.min(x_data), np.max(x_data), 1000)
    y_fit = myfunc(x_fit, *popt)
    
    plt.plot(x_data, y_data, "ro", label = "data")
    plt.plot(x_fit, y_fit, "b", label = "fit")
    plt.xlabel(r'$\theta$ (degrees)');
    plt.ylabel(r'$f(\theta)$');
    
    plt.legend()
    plt.show()
    

    we get this output:

    [34.31293761 26.92479369  2.20852009  1.18144319]
    126.53888003953764
    

    enter image description here