
Troubleshooting a robust regression model created from an OLS model in statsmodels


I am having trouble running a robust regression model with statsmodels in Python.

The following OLS model works:

model_name = sm.ols(formula="depenent ~ var1 * var2 + var3", data=data).fit()

I tried running:

model_name = sm.RLM(formula="depenent ~ var1 * var2 + var3", data=data).fit()

but I get the following type error:

TypeError: __init__() missing 2 required positional arguments: 'endog' and 'exog'

I read through this documentation: https://www.statsmodels.org/dev/rlm.html but am still struggling with the code. I am open to using another package such as scikit-learn.

Thank you.


Solution

  • The ols version should not work if sm is statsmodels.api, because statsmodels.api only exposes OLS (class names are in capital letters). Since the call works, sm in the question is presumably statsmodels.formula.api.

    The formula functions are lower case, e.g. rlm imported from statsmodels.formula.api. rlm is just an alias of the class method RLM.from_formula.

    RLM in capital letters is the class itself. It does not support formulas directly and requires endog and exog as numpy arrays or pandas DataFrames or Series, which is why the call in the question raises the TypeError.

    See, for example, http://www.statsmodels.org/devel/examples/notebooks/generated/formulas.html

    Note that the lower-case objects in formula.api are simply defined as aliases, e.g. for OLS/ols and RLM/rlm:

    import statsmodels.regression.linear_model as lm_
    import statsmodels.robust.robust_linear_model as roblm_
    
    ols = lm_.OLS.from_formula
    rlm = roblm_.RLM.from_formula