
Calling `R.feols(lnSO2 ~ tp, data = df)` through rpy2 in Python raises a SyntaxError at `~`


I am an economics student with little experience in R. I am trying to call fixest::feols from Python, but I get an error. Can anyone help? Below is the code from my project, which is a basic DID model. This is my first question on Stack Overflow, so if you need more details, I can add them.

import pandas as pd
from rpy2 import robjects as ro
from rpy2.robjects.packages import importr
from rpy2.robjects import pandas2ri
pandas2ri.activate()
from rpy2.robjects.conversion import localconverter
R = ro.r

base = importr('base')
# fixest is the package that provides the feols function.
fixest = importr('fixest')


df = pd.read_stata(r'mydata.dta')
df = df[['lnSO2', 'tp', "year", 'firm_ID',"pac" ]].iloc[:100, :]

print(R.summary(df))
print(R.summary(R.lm("lnSO2~ tp", data=df)))
# The next line raises the error. The complete call in R is `feols(lnSO2 ~ 1 + tp |year+firm_ID, data=df, cluster=~pac) `
print(R.feols(lnSO2~ tp| year+firm_ID, data=df))

The error is shown below:

  File "<ipython-input-19-02d37bf05a02>", line 1
    print(R.feols(lnSO2~ tp| year+firm_ID, data=df))
                       ^
SyntaxError: invalid syntax

I have also tried print(R.feols("lnSO2~ tp| year+firm_ID", data=df)), but that does not give the right result either. It returns a similar error message from R: Error in feols("lnSO2 ~ 1 + tp |year+firm_ID", data = df, cluster = ~pac) : The argument 'fml' must be a two-sided formula. Problem: it is not a formula, not a data.frame nor a matrix (instead it is a vector).
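A note on why the two attempts fail differently. In Python, `~` exists only as a unary operator (bitwise NOT), so `lnSO2~ tp` cannot be parsed and the interpreter stops before rpy2 or R is ever called. Quoting the formula makes the line valid Python, but the string then reaches R as a character vector, which is what the `feols` message reports. The sketch below checks only the Python side, using the standard `ast` module; the helper name `parses` is made up for illustration.

```python
import ast


def parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python (nothing is executed)."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# The call from the question: an R formula written directly in Python source.
bare = "R.feols(lnSO2~ tp| year+firm_ID, data=df)"
# The same call with the formula passed as a string.
quoted = 'R.feols("lnSO2~ tp| year+firm_ID", data=df)'
# A one-sided R formula used as a keyword argument.
one_sided = "R.feols(fml, data=df, cluster=~pac)"

print(parses(bare))       # False: `~` cannot follow a name as a binary operator
print(parses(quoted))     # True: the formula is just a string literal
print(parses(one_sided))  # True: parsed as bitwise NOT of a Python variable `pac`
```

The last case is worth noticing: `cluster=~pac` is accepted by Python's parser, but it means "bitwise NOT of the Python variable `pac`", not an R formula, so it would not do what the R code does.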

Can anyone help me get the same results as in R? Alternatively, since LinearModels can only handle two fixed effects, can anyone suggest another way to run a fixed-effects regression with three or more fixed effects in Python?
Thanks and best wishes!


Solution

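  • Try creating Formula objects. A plain Python string is passed to R as a character vector, not as a formula, which is why feols rejects it: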

    Import the class:

    from rpy2.robjects import Formula
    

    Then do the following:

    R.lm(formula=Formula("lnSO2~ tp"), data=df)
    R.feols(fml=Formula("lnSO2~ tp| year+firm_ID"), data=df)
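The same idea can be extended to the full model from the question, `feols(lnSO2 ~ 1 + tp |year+firm_ID, data=df, cluster=~pac)`. What follows is a minimal sketch, not a tested recipe: it assumes rpy2 3.x with R and the fixest package installed, and the helper name `fit_did` and the constants `FML` and `CLUSTER` are made up for illustration. The one-sided `~pac` is wrapped in `Formula` as well, since it is also an R formula. The DataFrame is converted explicitly instead of relying on `pandas2ri.activate()`, which is a choice of this sketch and not something the answer above requires.

```python
# Formula strings taken from the R call quoted in the question.
FML = "lnSO2 ~ 1 + tp | year + firm_ID"  # outcome ~ regressors | fixed effects
CLUSTER = "~pac"                         # one-sided formula naming the cluster variable


def fit_did(df, fml=FML, cluster=CLUSTER):
    """Fit fixest::feols on a pandas DataFrame (hypothetical helper)."""
    # Imported lazily so this file can be loaded where R or rpy2 is missing.
    from rpy2.robjects import Formula, default_converter, pandas2ri
    from rpy2.robjects.conversion import localconverter
    from rpy2.robjects.packages import importr

    fixest = importr("fixest")

    # Convert the pandas DataFrame to an R data.frame inside a local
    # conversion context, so no global converter has to be activated.
    conv = default_converter + pandas2ri.converter
    with localconverter(conv):
        r_df = conv.py2rpy(df)

    # Both formulas are built as R formula objects, not Python strings.
    return fixest.feols(Formula(fml), data=r_df, cluster=Formula(cluster))
```

With the `df` from the question, `fit = fit_did(df)` followed by `print(R.summary(fit))` should then print the regression summary, provided the installed rpy2 version behaves as assumed here.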