I am running an OLS regression with country-specific effects (LSDV) on my panel data (a dataframe). This is the result:
============================== OLSR With Dummies ==============================
OLS Regression Results
==============================================================================
Dep. Variable:                    ELC   R-squared:                       0.969
Model:                            OLS   Adj. R-squared:                  0.968
Method:                 Least Squares   F-statistic:                     1185.
Date:                Fri, 11 Mar 2022   Prob (F-statistic):               0.00
Time:                        10:13:02   Log-Likelihood:                -5120.2
No. Observations:                5237   AIC:                         1.051e+04
Df Residuals:                    5101   BIC:                         1.140e+04
Df Model:                         135
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      2.8980      0.130     22.355      0.000       2.644       3.152
IWW            0.6373      0.011     55.687      0.000       0.615       0.660
GDPC           0.7915      0.016     50.249      0.000       0.761       0.822
CDD            0.0333      0.007      4.750      0.000       0.020       0.047
HDD            0.1124      0.008     14.793      0.000       0.097       0.127
TIME           0.3588      0.013     28.110      0.000       0.334       0.384
AGO           -4.1382      0.147    -28.187      0.000      -4.426      -3.850
ALB           -7.4068      0.166    -44.670      0.000      -7.732      -7.082
I am obtaining the fitted values with df_results.fittedvalues or df_results.predict(exog).
To make sure my calculation is correct, I want to compare a manually calculated y with the fitted value. For example, for ALB: y = 0.6373*IWW + 0.7915*GDPC + 0.0333*CDD + 0.1124*HDD + 0.3588*TIME + (2.8980 - 7.4068), but the two differ slightly, by about 2~3% (y = 5.68 versus y_fittedvalue = 5.79). I guess the difference comes from noise (the error term), but I couldn't find any source or evidence for that. I would appreciate it if anyone could explain what causes this difference. And if it comes from noise, how can I get the noise values?
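In your manual calculation you are using the rounded coefficients printed in the summary table. To do this precisely, use the unrounded values in df_results.params, multiply them by the matching columns of exog (the full design matrix, including the intercept and dummy columns), and sum the terms for each row:
(df_results.params * exog).sum(axis=1)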
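As a rough check of this, here is a minimal sketch on synthetic data. The column names, coefficient values, and three-country setup are made up for illustration and are not the asker's dataset. It assumes the model was fit with the statsmodels formula API and rebuilds the design matrix from results.model.exog and results.model.exog_names. Fitted values are just the design matrix times the coefficients, so they contain no noise term; the noise estimate is the residual, observed y minus fitted y, which statsmodels also exposes as results.resid.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic panel data; every name and value below is an illustrative assumption.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "country": rng.choice(["AGO", "ALB", "ARG"], size=n),
    "IWW": rng.normal(size=n),
    "GDPC": rng.normal(size=n),
})
country_effect = df["country"].map({"AGO": -4.0, "ALB": -7.0, "ARG": 0.0})
noise = rng.normal(scale=0.5, size=n)
df["ELC"] = 3.0 + 0.6 * df["IWW"] + 0.8 * df["GDPC"] + country_effect + noise

# LSDV: the country dummies are generated by C(country).
results = smf.ols("ELC ~ IWW + GDPC + C(country)", data=df).fit()

# Full design matrix used in the fit: intercept, dummy columns, regressors.
X = pd.DataFrame(results.model.exog,
                 columns=results.model.exog_names,
                 index=df.index)

# Manual prediction with full-precision coefficients: sum of coef * column.
manual_exact = (X * results.params).sum(axis=1)

# The same calculation with coefficients rounded as in the summary table.
manual_rounded = (X * results.params.round(4)).sum(axis=1)

max_gap_exact = (manual_exact - results.fittedvalues).abs().max()
max_gap_rounded = (manual_rounded - results.fittedvalues).abs().max()
print("largest gap, exact coefficients:  ", max_gap_exact)
print("largest gap, rounded coefficients:", max_gap_rounded)

# Residuals (the estimated noise) are observed minus fitted values.
residuals = df["ELC"] - results.fittedvalues
print("matches results.resid:", np.allclose(residuals, results.resid))
```

If a gap of a few percent remains after switching to the unrounded coefficients, it is worth rechecking that the manual inputs come from the same row and use the same transformations as the data passed to the model.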