Search code examples
pythonscipycurve-fitting

scipy optimize.curve_fit not performing linear fit


I have been trying to use the curve_fit to fit my data with a linear fitting function that I provide. Here's a simplified version:

from scipy.optimize import curve_fit
import numpy as np


def line(x,m,c):
    return x*m + c

x = [10.55977951, 11.45686089, 12.35423473, 13.30408861, 14.3528772,  15.45035217,
     16.64817782, 17.99661252, 19.49184541, 21.18832373, 23.08427703, 25.27645953,
     27.91555605, 31.14343828, 35.16744841, 40.31943045, 47.22644384, 57.23413423, 73.51606363]

y = [ 4.50873148e-04, 2.29554869e-04, 4.62602769e-04, 1.57525181e-04,
     1.16334543e-04, 9.42105291e-05, 2.86606379e-04, 2.40194287e-04,
     4.74193673e-04, 6.83848270e-04, 6.73286482e-04, 2.03506304e-04,
     3.58126867e-04, 1.88155439e-04, 5.14133854e-04, 2.39990293e-04,
     -3.60391884e-05, -1.17866329e-04, 2.68954649e-05]

y_err = [8.23397676e-05, 7.54222285e-05, 7.05355053e-05, 6.30493368e-05,
         5.73241555e-05, 5.56298862e-05, 5.00181328e-05, 4.76554758e-05,
         4.45081313e-05, 4.23716792e-05, 4.10516842e-05, 3.87066834e-05,
         3.67639901e-05, 3.51162489e-05, 3.39993704e-05, 3.29275562e-05,
         3.24743967e-05, 3.15296789e-05, 3.09144126e-05]

x = np.array(x)
y = np.array(y)
y_err = np.array(y_err)


popt, cov = curve_fit(line, x, y, sigma=y_err)
m = popt[0]
c = popt[1]
print(m,c, cov)

line1 = m*np.array(x) + c

import matplotlib.pyplot as plt
plt.plot(x, line1, color="red", )
plt.plot(x, [0] * len(x), color="black")
plt.errorbar(x , y,  label="g1", fmt="s", markersize=5, color="red")
        
plt.xscale("log")
plt.xlabel("dummy x ")
plt.ylabel("dummy y")
plt.legend()
plt.tight_layout()
plt.show()

Basically, the fitting only works correctly (i.e. does a line fit) for some cases of x,y data and in other cases like in the code block above, will give some polynomial fit. I have tried feeding in different initial guesses for p0, but it really doesn't fix the issue--just alters the parameters a teeny bit. Ideally, I would prefer not to give a p0 since I want to pass a bunch of different data arrays in. Would love to solve this issue! I am also open to alternative fitting suggestions that would provide a covariance. Thanks everyone


Solution

  • "Some polynomial fit"? Perhaps you thought it was a polynomial of order 2 or higher because your output was curved, but that's actually just because you had log-scale. Remove the log scale from your output. Add the Jacobian because it's trivial. And in your errorbar... you weren't passing errors. Why?

    from scipy.optimize import curve_fit
    import numpy as np
    import matplotlib.pyplot as plt
    
    
    def line(x: np.ndarray, m: float, c: float) -> np.ndarray:
        return x*m + c
    
    
    def jacobian(x: np.ndarray, m: float, c: float) -> np.ndarray:
        return np.stack((
            x,                # dy/dm
            np.ones_like(x),  # dy/dc
        ), axis=1)
    
    
    x = np.array([
        10.55977951, 11.45686089, 12.35423473, 13.30408861, 14.3528772, 15.45035217,
        16.64817782, 17.99661252, 19.49184541, 21.18832373, 23.08427703, 25.27645953,
        27.91555605, 31.14343828, 35.16744841, 40.31943045, 47.22644384, 57.23413423,
        73.51606363,
    ])
    
    y = np.array([
        4.50873148e-04, 2.29554869e-04, 4.62602769e-04, 1.57525181e-04,
        1.16334543e-04, 9.42105291e-05, 2.86606379e-04, 2.40194287e-04,
        4.74193673e-04, 6.83848270e-04, 6.73286482e-04, 2.03506304e-04,
        3.58126867e-04, 1.88155439e-04, 5.14133854e-04, 2.39990293e-04,
        -3.60391884e-05, -1.17866329e-04, 2.68954649e-05,
    ])
    
    y_err = np.array([
        8.23397676e-05, 7.54222285e-05, 7.05355053e-05, 6.30493368e-05,
        5.73241555e-05, 5.56298862e-05, 5.00181328e-05, 4.76554758e-05,
        4.45081313e-05, 4.23716792e-05, 4.10516842e-05, 3.87066834e-05,
        3.67639901e-05, 3.51162489e-05, 3.39993704e-05, 3.29275562e-05,
        3.24743967e-05, 3.15296789e-05, 3.09144126e-05,
    ])
    
    (m, c), cov = curve_fit(f=line, jac=jacobian, xdata=x, ydata=y, sigma=y_err)
    
    fig, ax = plt.subplots()
    ax.plot(x, line(x, m, c), label='fit')
    ax.errorbar(x, y, y_err, fmt='s', label='g1')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.legend()
    
    plt.show()
    

    fit