How to include interaction terms in a statsmodels logit model in Python?

I am working on a logistic regression model using the statsmodels API's logit, and I can't figure out how to feed interaction terms to the model.


  • You can use the formula interface and put a colon (:) between the two variables inside the formula. For example:

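    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Random binary data, for illustration only
    df = pd.DataFrame(np.random.binomial(1, 0.5, (50, 3)), columns=['x1', 'x2', 'y'])
    # x1:x2 adds the interaction (product) term alongside the main effects
    res1 = smf.logit(formula='y ~ x1 + x2 + x1:x2', data=df).fit()
    print(res1.summary())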
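                               Logit Regression Results
    ==============================================================================
    Dep. Variable:                      y   No. Observations:                   50
    Model:                          Logit   Df Residuals:                       46
    Method:                           MLE   Df Model:                            3
    Date:                Thu, 04 Feb 2021   Pseudo R-squ.:                 0.02229
    Time:                        10:03:59   Log-Likelihood:                -32.463
    converged:                       True   LL-Null:                       -33.203
    Covariance Type:            nonrobust   LLR p-value:                    0.6869
    ==============================================================================
                     coef    std err          z      P>|z|      [0.025      0.975]
    ------------------------------------------------------------------------------
    Intercept     -0.9808      0.677     -1.449      0.147      -2.308       0.346
    x1             0.4700      0.851      0.552      0.581      -1.199       2.139
    x2             0.9808      0.863      1.137      0.256      -0.710       2.671
    x1:x2         -1.1632      1.229     -0.946      0.344      -3.572       1.246
    ==============================================================================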