Search code examples
pythonrstatsmodelsmixed-models

mixed-models with two random effects - statsmodels


import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv('http://www.bodowinter.com/tutorial/politeness_data.csv')
df = df.drop(38)

In R I would do:

lmer(frequency ~ attitude + (1|subject) + (1|scenario), data=df)

which in R gives me:

Random effects:
 Groups   Name        Variance Std.Dev.
 scenario (Intercept)  219     14.80   
 subject  (Intercept) 4015     63.36   
 Residual              646     25.42   
Fixed effects:
            Estimate Std. Error t value
(Intercept)  202.588     26.754   7.572
attitudepol  -19.695      5.585  -3.527

I tried to do the same with statsmodels:

model = smf.mixedlm("frequency ~ attitude", data=df, groups=df[["subject","scenario"]]).fit()

But model.summary() gives me a different output:

      Mixed Linear Model Regression Results
=======================================================
Model:            MixedLM Dependent Variable: frequency
No. Observations: 83      Method:             REML     
No. Groups:       2       Scale:              0.0000   
Min. group size:  1       Likelihood:         inf      
Max. group size:  1       Converged:          Yes      
Mean group size:  1.0                                  
-------------------------------------------------------
                  Coef.  Std.Err. z P>|z| [0.025 0.975]
-------------------------------------------------------
Intercept        204.500                               
attitude[T.pol]    8.800                               
groups RE          0.000                               
=======================================================

Solution

  • The code below reproduces the R results. Since this is a crossed model with no independent groups, you need to put everyone in the same group and specify the random effects using variance components.

    import pandas as pd                                                                                                        
    import statsmodels.api as sm                                                                                               
                                                                                                                               
    df = pd.read_csv('http://www.bodowinter.com/uploads/1/2/9/3/129362560/politeness_data.csv')                                                 
    df = df.dropna()                                                                                                           
    df["group"] = 1                                                                                                            
                                                                                                                               
    vcf = {"scenario": "0 + C(scenario)", "subject": "0 + C(subject)"}                                                         
    model = sm.MixedLM.from_formula("frequency ~ attitude", groups="group",                                                    
                                    vc_formula=vcf, re_formula="0", data=df)                                                   
    result = model.fit()  
    

    Here are the results:

                Mixed Linear Model Regression Results
    ==============================================================
    Model:               MixedLM   Dependent Variable:   frequency
    No. Observations:    83        Method:               REML     
    No. Groups:          1         Scale:                646.0163 
    Min. group size:     83        Likelihood:           -396.7268
    Max. group size:     83        Converged:            Yes      
    Mean group size:     83.0                                     
    --------------------------------------------------------------
                     Coef.   Std.Err.   z    P>|z|  [0.025  0.975]
    --------------------------------------------------------------
    Intercept        202.588   26.754  7.572 0.000 150.152 255.025
    attitude[T.pol]  -19.695    5.585 -3.526 0.000 -30.641  -8.748
    scenario Var     218.991    6.476                             
    subject Var     4014.616  104.614                             
    ==============================================================