Each of my variables is stored as its own list.
I am using a method found in another thread here:
import numpy as np
import statsmodels.api as sm

y = [1,2,3,4,3,4,5,4,5,5,4,5,4,5,4,5,6,5,4,5,4,3,4]
x = [
    [4,2,3,4,5,4,5,6,7,4,8,9,8,8,6,6,5,5,5,5,5,5,5],
    [4,1,2,3,4,5,6,7,5,8,7,8,7,8,7,8,7,7,7,7,7,6,5],
    [4,1,2,5,6,7,8,9,7,8,7,8,7,7,7,7,7,7,6,6,4,4,4]
]

def reg_m(y, x):
    ones = np.ones(len(x[0]))
    X = sm.add_constant(np.column_stack((x[0], ones)))
    for ele in x[1:]:
        X = sm.add_constant(np.column_stack((ele, X)))
    results = sm.OLS(y, X).fit()
    return results
My only problem is that in the regression output the explanatory variables are labelled x1, x2, x3, etc. Is it possible to change these to more meaningful names?
Thanks
Searching through the source, it appears that the summary() method supports supplying your own names for the explanatory variables through its xname argument. So:
results = sm.OLS(y, X).fit()
# xname takes one name per column of X, in column order (the constant is the last column here)
print(results.summary(xname=['Fred', 'Mary', 'Ethel', 'Bob']))
gives us:
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.535
Model:                            OLS   Adj. R-squared:                  0.461
Method:                 Least Squares   F-statistic:                     7.281
Date:                Mon, 11 Apr 2016   Prob (F-statistic):            0.00191
Time:                        22:22:47   Log-Likelihood:                -26.025
No. Observations:                  23   AIC:                             60.05
Df Residuals:                      19   BIC:                             64.59
Df Model:                           3
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
Fred           0.2424      0.139      1.739      0.098        -0.049     0.534
Mary           0.2360      0.149      1.587      0.129        -0.075     0.547
Ethel         -0.0618      0.145     -0.427      0.674        -0.365     0.241
Bob            1.5704      0.633      2.481      0.023         0.245     2.895
==============================================================================
Omnibus:                        6.904   Durbin-Watson:                   1.905
Prob(Omnibus):                  0.032   Jarque-Bera (JB):                4.708
Skew:                          -0.849   Prob(JB):                       0.0950
Kurtosis:                       4.426   Cond. No.                         38.6
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
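
One thing to watch: the names are matched to the columns of X by position, and reg_m puts each new list in front of the ones already stacked, so the columns come out in reverse order with the constant last. Here is a rough sketch that rebuilds the matrix with numpy only and derives a matching name list. The helper names stack_like_reg_m and xname_for_reg_m and the labels "x0", "x1", "x2" are made up for illustration. I have left out sm.add_constant on the assumption that it leaves a matrix that already contains a column of ones unchanged, which the four-row table above is consistent with.

```python
import numpy as np

def stack_like_reg_m(x):
    """Rebuild the design matrix the way reg_m stacks it, without fitting."""
    # Start with [x[0], 1]; the column of ones is the intercept.
    X = np.column_stack((x[0], np.ones(len(x[0]))))
    # Each later list is placed in front of what is already there.
    for ele in x[1:]:
        X = np.column_stack((ele, X))
    return X

def xname_for_reg_m(labels, const_name="const"):
    """Return one name per column of X, in column order."""
    # Columns end up as x[-1], ..., x[1], x[0], constant.
    return list(reversed(labels)) + [const_name]

x = [
    [4,2,3,4,5,4,5,6,7,4,8,9,8,8,6,6,5,5,5,5,5,5,5],
    [4,1,2,3,4,5,6,7,5,8,7,8,7,8,7,8,7,7,7,7,7,6,5],
    [4,1,2,5,6,7,8,9,7,8,7,8,7,7,7,7,7,7,6,6,4,4,4],
]

X = stack_like_reg_m(x)
names = xname_for_reg_m(["x0", "x1", "x2"])
print(names)  # ['x2', 'x1', 'x0', 'const']
```

If that holds for your statsmodels version, the resulting list can be passed straight to results.summary(xname=names). In the output above, that would make Fred the third list in x and Bob the intercept.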