python, regression, statsmodels

VIF calculation in Python statsmodels


The statsmodels code for calculating VIF is below:

k_vars = exog.shape[1]
x_i = exog[:, exog_idx]
mask = np.arange(k_vars) != exog_idx
x_noti = exog[:, mask]
r_squared_i = OLS(x_i, x_noti).fit().rsquared   ## NO INTERCEPT
vif = 1. / (1. - r_squared_i)

The auxiliary regression is fit without an intercept. According to "Introductory Econometrics" (6th ed.) by Wooldridge, an intercept should be included: "... R-squared from regressing Xj on all other independent variables (and including an intercept)."

Is statsmodels wrong? Is there another package I can cross-check against? Thanks.


Solution

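  • When using statsmodels, always remember to add a constant yourself (it is necessary in this case). The VIF function regresses each column on whatever other columns you pass in, so the exog array should already contain a constant column. Quoting from the docs: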

    An intercept is not included by default and should be added by the user. See statsmodels.tools.add_constant.

    Reference from MATLAB: https://www.mathworks.com/help/econ/examples/time-series-regression-ii-collinearity-and-estimator-variance.html
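
As a rough illustration of the point above, the following sketch computes VIFs with and without a constant column and cross-checks the result against NumPy alone. The data are synthetic and the variable names (x1, x2, x3, vif_with_const, and so on) are made up for this example. The cross-check relies on the standard identity that the VIFs from regressions with an intercept equal the diagonal of the inverse correlation matrix of the regressors.

```python
import numpy as np
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools import add_constant

# Synthetic regressors (illustrative only); x2 is correlated with x1,
# and the columns have nonzero means so the intercept matters.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(loc=10.0, size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
x3 = rng.normal(loc=-5.0, size=n)
X = np.column_stack([x1, x2, x3])

# Prepend a column of ones so every auxiliary regression has an intercept.
X_const = add_constant(X)  # column 0 is the constant

# VIF for each real regressor (skip column 0, the constant itself).
vif_with_const = np.array(
    [variance_inflation_factor(X_const, i) for i in range(1, X_const.shape[1])]
)

# Same call on the raw matrix: the auxiliary regressions have no intercept.
vif_no_const = np.array(
    [variance_inflation_factor(X, i) for i in range(X.shape[1])]
)

# Independent cross-check with NumPy only: diagonal of the inverse
# correlation matrix.
vif_check = np.diag(np.linalg.inv(np.corrcoef(X, rowvar=False)))

print("with constant:   ", vif_with_const)
print("without constant:", vif_no_const)
print("NumPy cross-check:", vif_check)
```

With the constant added, the statsmodels values should agree with the NumPy cross-check. Without it, the values differ because each auxiliary regression is forced through the origin.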