Tags: python, numpy, regression, curve-fitting, scientific-computing

Calculation of R2


I am trying to compute the coefficient of determination (R2) for these values:

        valeur_T = [45, 77, 102]
        valeur_min = [55, 80, 105]

I compute R2 as follows, but I always get the same result:

        P2 = polyfit(valeur_T,valeur_min, 2)
        p= poly1d(P2)
        yhat = p(valeur_T)
        ybar = sum(valeur_min)/len(valeur_min)
        SST = sum((valeur_min - ybar)**2)
        SSreg = sum((yhat - ybar)**2)

        R2 = SSreg/SST

SST and SSreg always have the same value, so R2 = 1.

Where is my error?


Solution

  • You are fitting a second-order polynomial through 3 points. A second-order polynomial has three coefficients, so it passes exactly through any three points with distinct x values, and naturally you get a perfect fit (R2 = 1). Your other errors seem to stem from your use of regular Python lists instead of NumPy arrays, which support vectorized operations such as the one you want to carry out here:

    SST = sum((valeur_min - ybar)**2)
    

    Adding an extra data point and modifying your code to support NumPy throughout,

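    import numpy as np
    
    # NumPy arrays support the elementwise arithmetic used below
    valeur_T = np.array([45., 77., 102., 110.])
    valeur_min = np.array([55., 80., 105., 122.])
    
    # Least-squares fit of a second-order polynomial
    P2 = np.polyfit(valeur_T, valeur_min, 2)
    p = np.poly1d(P2)
    yhat = p(valeur_T)                      # fitted values
    ybar = np.mean(valeur_min)              # mean of the observations
    SST = np.sum((valeur_min - ybar)**2)    # total sum of squares
    SSreg = np.sum((yhat - ybar)**2)        # regression sum of squares
    
    R2 = SSreg / SST
    print(R2)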
    

    gives

    0.993316215465
    

    But only you can say whether this adapted code will suit your use-case, of course.
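
To make the point about the perfect fit concrete, here is a minimal sketch that compares fits of different degree on the data from the question. The helper name `r_squared` is hypothetical (it is not part of NumPy or of the original post), and it computes R2 as 1 - SSres/SST, which for a least-squares polynomial fit with a constant term should agree with SSreg/SST up to rounding.

```python
import numpy as np

def r_squared(x, y, degree):
    """R^2 of a least-squares polynomial fit, computed as 1 - SSres/SST.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=float)   # accept plain lists as well as arrays
    y = np.asarray(y, dtype=float)
    coeffs = np.polyfit(x, y, degree)
    yhat = np.polyval(coeffs, x)             # fitted values
    ss_res = np.sum((y - yhat) ** 2)         # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)     # total sum of squares
    return 1.0 - ss_res / ss_tot

# The three points from the question
three_x, three_y = [45, 77, 102], [55, 80, 105]

# Degree 2 has three coefficients, so it interpolates three points exactly
r2_quad_3 = r_squared(three_x, three_y, 2)

# Degree 1 has only two coefficients, so some residual remains
r2_line_3 = r_squared(three_x, three_y, 1)

# Four points with a degree-2 fit: no longer an exact interpolation
r2_quad_4 = r_squared([45, 77, 102, 110], [55, 80, 105, 122], 2)

print(r2_quad_3, r2_line_3, r2_quad_4)

# Plain lists do not support elementwise subtraction of a scalar
try:
    three_y - 80.0
    list_subtraction_failed = False
except TypeError as exc:
    list_subtraction_failed = True
    print("TypeError:", exc)
```

With three points, the degree-2 fit should come out at R2 = 1 up to floating-point rounding, while the degree-1 fit and the four-point degree-2 fit should both fall slightly below 1. In general, R2 is only informative when there are more data points than fitted coefficients.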