I get two different intercept values from using the statsmodels regression fit and the numpy polyfit. The model is a simple linear regression with a single variable.
From the statsmodels regression I use:
results1 = smf.ols('np.log(NON_UND) ~ (np.log(Food_consumption))', data=Data2).fit()
which gives the following results:
                               coef    std err          t      P>|t|      [0.025      0.975]
--------------------------------------------------------------------------------------------
Intercept                    5.4433      0.270     20.154      0.000       4.911       5.976
np.log(Food_consumption)     1.1128      0.026     42.922      0.000       1.062       1.164
When I plot the data and add a trendline using numpy polyfit, I receive a different intercept value:
x = np.array((np.log(Data2.Food_consumption)))
y = np.array((np.log(Data2.NON_UND)*100))
z = np.polyfit(x, y, 1)
array([ 1.11278898, 10.04846693])
How come I get two different values for the intercept?
Thanks in advance!
This is because the two regressions do not fit the same model. Both use logs of the dependent and independent variables, but in the second one the dependent variable is also scaled by 100. The slopes agree (1.1128), and the intercepts differ by 10.0485 - 5.4433 ≈ 4.605, which is ln(100). That is exactly the shift a factor of 100 inside the log produces, since log(100*y) = log(y) + log(100): the slope is unchanged and only the intercept moves.
To reproduce the statsmodels result with polyfit, make the second model identical to the first by dropping the factor of 100. I suggest you do this:
x = np.log(np.array(Data2.Food_consumption))
y = np.log(np.array(Data2.NON_UND))
z = np.polyfit(x, y, 1)
np.polyfit returns the coefficients from the highest degree down, so z[0] is the slope and z[1] is the intercept. Both should now match the statsmodels values (about 1.1128 and 5.4433).
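To check the reasoning without the original data, here is a minimal sketch on synthetic numbers. The arrays food and non_und are made-up stand-ins for Data2.Food_consumption and Data2.NON_UND, and np.linalg.lstsq is used as a plain ordinary-least-squares stand-in for the smf.ols call, so treat it as an illustration rather than a reproduction of your results.

```python
import numpy as np

# Synthetic stand-in data (hypothetical; not the asker's Data2).
rng = np.random.default_rng(0)
food = rng.uniform(10.0, 1000.0, size=200)
non_und = np.exp(5.4) * food**1.1 * np.exp(rng.normal(0.0, 0.1, size=200))

x = np.log(food)
y = np.log(non_und)

# Same model as log(NON_UND) ~ log(Food_consumption), solved by ordinary least squares.
X = np.column_stack([np.ones_like(x), x])  # intercept column, then regressor
ols_intercept, ols_slope = np.linalg.lstsq(X, y, rcond=None)[0]

# polyfit returns the highest-degree coefficient first: [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)

# Scaling the dependent variable by 100 inside the log shifts only the intercept.
slope_100, intercept_100 = np.polyfit(x, np.log(non_und * 100), 1)

print("polyfit:        ", slope, intercept)
print("least squares:  ", ols_slope, ols_intercept)
print("intercept shift:", intercept_100 - intercept, "vs log(100) =", np.log(100))
```

With identical x and y the two fits agree up to floating-point error, and the only effect of the factor of 100 is an intercept shift of log(100) ≈ 4.605.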