I have a function: f(theta) = a+b*cos(theta - c) as well as sampled data. I'd like to find the coefficients a, b, and c that minimize mean square error. Any idea if there's an efficient way to do this in python?
EDIT:
import numpy as np
from scipy.optimize import curve_fit
#definition of the function
def myfunc(x, a, b, c):
return a + b * np.cos(x - c)
#sample data
x_data = [0, 60, 120, 180, 240, 300]
y_data = [25, 40, 70, 30, 10, 15]
#the actual curve fitting procedure, a, b, c are stored in popt
popt, _pcov = curve_fit(myfunc, x_data, y_data)
print(popt)
print(np.degrees(popt[2]))
#the rest is just a graphic representation of the data points and the fitted curve
from matplotlib import pyplot as plt
#x_fit = np.linspace(-1, 6, 1000)
y_fit = myfunc(x_data, *popt)
plt.plot(x_data, y_data, "ro")
plt.plot(x_data, y_fit, "b")
plt.xlabel(r'$\theta$ (degrees)');
plt.ylabel(r'$f(\theta)$');
plt.legend()
plt.show()
Here is a picture showing how the curve doesn't really fit the points. It seems like the amplitude should be higher. The local mins and maxes appear to be in the right places.
scipy.optimize.curve_fit makes it really easy to fit data points to your custom function:
import numpy as np
from scipy.optimize import curve_fit
#definition of the function
def myfunc(x, a, b, c):
return a + b * np.cos(x - c)
#sample data
x_data = np.arange(5)
y_data = 2.34 + 1.23 * np.cos(x_data + .23)
#the actual curve fitting procedure, a, b, c are stored in popt
popt, _pcov = curve_fit(myfunc, x_data, y_data)
print(popt)
#the rest is just a graphic representation of the data points and the fitted curve
from matplotlib import pyplot as plt
x_fit = np.linspace(-1, 6, 1000)
y_fit = myfunc(x_fit, *popt)
plt.plot(x_data, y_data, "ro", label = "data points")
plt.plot(x_fit, y_fit, "b", label = "fitted curve\na = {}\nb = {}\nc = {}".format(*popt))
plt.legend()
plt.show()
Output:
[ 2.34 1.23 -0.23]
Edit:
Your question update introduces several problems. First, your x-values are in degree, while np.cos
expects values in radians. Therefore, we better convert the values with np.deg2rad
. The reverse function would be np.rad2deg
.
Second, it is a good idea to fit for different frequencies as well, let's introduce an additional parameter for that.
Third, fits are usually quite sensitive to initial guesses. You can provide a parameter p0
in scipy
for that.
Fourth, you changed the resolution of the fitted curve to the low resolution of your data points, hence it looks so undersampled. If we address all these problems:
import numpy as np
from scipy.optimize import curve_fit
#sample data
x_data = [0, 60, 120, 180, 240, 300]
y_data = [25, 40, 70, 30, 10, 15]
#definition of the function with additional frequency value d
def myfunc(x, a, b, c, d):
return a + b * np.cos(d * np.deg2rad(x) - c)
#initial guess of parameters a, b, c, d
p_initial = [np.average(y_data), np.average(y_data), 0, 1]
#the actual curve fitting procedure, a, b, c, d are stored in popt
popt, _pcov = curve_fit(myfunc, x_data, y_data, p0 = p_initial)
print(popt)
#we have to convert the phase shift back into degrees
print(np.rad2deg(popt[2]))
#graphic representation of the data points and the fitted curve
from matplotlib import pyplot as plt
#define x_values for a smooth curve representation
x_fit = np.linspace(np.min(x_data), np.max(x_data), 1000)
y_fit = myfunc(x_fit, *popt)
plt.plot(x_data, y_data, "ro", label = "data")
plt.plot(x_fit, y_fit, "b", label = "fit")
plt.xlabel(r'$\theta$ (degrees)');
plt.ylabel(r'$f(\theta)$');
plt.legend()
plt.show()
we get this output:
[34.31293761 26.92479369 2.20852009 1.18144319]
126.53888003953764