Search code examples
pythonstatisticsscipycurve-fittingdata-analysis

Scipy curve_fit returns negative variance


I'm trying to use curvefit to fit my data to two logarithms.

from scipy.optimize import curve_fit

def func_2log(x, a, b, c, d, e):
    return a*np.log(x + 1 + b) +c*np.log(x + 1 + d) + e

p, cov = curve_fit(func_2log, x, y, p0 = initial_guess, sigma = yerr, maxfev = 100000)

However, when I print the covariance matrix, I get:

[[ -2.23883493e+06 -3.92893042e+09 -1.87846128e+01 -4.27371608e+02 2.46442543e+07]

[ -3.92893042e+09 -6.89487710e+12 -3.29629278e+04 -7.49919180e+05 4.32481902e+10]

[ -1.87846014e+01 -3.29629077e+04 -1.53314974e-04 -3.43241587e-03 2.06772574e+02]

[ -4.27371198e+02 -7.49918462e+05 -3.43241462e-03 -7.58284163e-02 4.70429813e+03]

[ 2.46442543e+07 4.32481902e+10 2.06772700e+02 4.70430264e+03 -2.71274697e+08]]

How is this matrix mathematically possible? How can a parameter be negatively correlated with itself??

Edit: I didn't want to include the data itself because x and y have about 90,000 elements. x goes from 0 to 87244 in steps of 1 and y ranges from 2 to 7.

Thanks for any help in advance.


Solution

  • The computation inside curve_fit for the covariance contains this:

    cov_x = inv(dot(transpose(R), R))
    

    Where R^T R is an approximation to the hessian produced by the optimization algorithm. Looks positive definite, right?

    The result indeed is positive definite in exact arithmetic. However, what probably is happening in your case is that the approximation has a high condition number, so rounding errors in computing the inverse result to loss of positive definiteness. Indeed, the condition number of the matrix you give above is ~ 10^21.

    If so, what this probably means in practice is that the estimated variance is infinite for some linear combination of the parameters, which prevents obtaining reliable estimates for any of the covariances.

    (For example, if the best fit is obtained for a = 0, b becomes ill-defined which may ruin the covariance estimates.)