Tags: python, scikit-learn, linear-regression, statsmodels

Access standardized residuals, Cook's distance, hat values (leverage), etc. easily in Python?


I am looking for influence statistics after fitting a linear regression. In R I can obtain them like this, for example:

hatvalues(fitted_model)       # hat values (leverage)
cooks.distance(fitted_model)  # Cook's distance
rstandard(fitted_model)       # standardized residuals
rstudent(fitted_model)        # studentized residuals

etc.

How can I obtain the same statistics when using statsmodels in Python after fitting a model like this:

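import statsmodels.api as sm

# Fit a linear model to any dataset: Y is the response, X the design matrix
# (wrap X in sm.add_constant(X) if the model should include an intercept)
model = sm.OLS(Y, X)
results = model.fit()

# Table of studentized residuals with unadjusted and Bonferroni-adjusted
# p-values (a DataFrame when the data carry pandas labels, else an ndarray)
outliers = results.outlier_test()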

Edit: See answer below...


Solution

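  • I found it in the statsmodels documentation:

    http://www.statsmodels.org/dev/generated/statsmodels.stats.outliers_influence.OLSInfluence.summary_frame.html

    OLSInfluence.summary_frame() returns a DataFrame with one row per observation, containing the leverage, Cook's distance, standardized and studentized residuals, DFFITS and DFBETAS.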