Search code examples
rregressionfixestr-marginaleffects

marginaleffects package gives "NaN" estimates when used with fixest package on large dataset


I am working on a 2sls IV model with interacting fixed effects, where the endogenous variable interacts with a binary variable. I am able to do this without any problems using feols() from the fixest package. I then try to get predictions for different sub-groups using the marginaleffects package, given its compatibility with fixest. However, this only works when I:

  1. limit my dataset to 50,000 observations, or
  2. do not use fixed effect interactions.

Unfortunately my dataset has over a million observations and I need to use the fixed effect interactions. With this the marginaleffects package results in NaN or NAs. I would be grateful for any advise on how I could proceed!

I used the below code:

#The main model
m1 = feols(y ~ x1 | fe1^fe2 |x2_end*x3_end ~ x2_iv*x3_end, data)

#Calculate marginal effects
avg_predictions(m1)
summary(marginaleffects(m1, variables = "x1"))
predictions(m1, newdata = datagrid(x2_end=unique, x3_end=unique))
predictions(m1, newdata = datagrid(x2_end=unique))
predictions(m1, newdata = datagrid(x3_end=unique))

#Alternative model spec
m2 = feols(y ~ x1 | fe1^fe2 |x2_end*x3_end ~ x2_iv*x3_end, data[1:50000,])
m3 = feols(y ~ x1 | fe1+fe2 |x2_end*x3_end ~ x2_iv*x3_end, data)

All the marginal effects command produced NaN or NA results for m1, but work for m2 & m3.


Solution

  • I have updated the development version of the marginaleffects package to return more informative error messages. You can install it with:

    remotes::install_github("vincentarelbundock/marginaleffects")
    

    Restart R completely for the changes to take effect.

    Then, you’ll hopefully see that the error message tells you exactly what you must do to fix the problem:

    library(marginaleffects)
    library(fixest)
    ptdata = readRDS("ptdata_vab.rds")
    
    baseint_fe = feols(inventor_foc_appNtk5 ~ prior_rev_n | examiner_art_unit^appYear |
        ioc * gender_code ~ ear * gender_code, ptdata)
    
    predictions(baseint_fe, by = c("ioc", "gender_code"))
    # Error: Unable to compute predicted values with this model. You can try to
    #   supply a different dataset to the `newdata` argument.  This error was
    #   also raised:
    #   
    #   Error in predict.fixest(object = model, newdata = newdata, type = type)
    #   : 
    #   You cannot use predict() based on the initial regression since the
    #   fixed-effect 'examiner_art_unit^appYear' was combined using an algorithm
    #   dropping the FE values (but fast). Please re-run the regression using
    #   the argument 'combine.quick=FALSE'.
    # 
    #   
    #   Bug Tracker:
    #   https://github.com/vincentarelbundock/marginaleffects/issues
    

    Follow the error message instructions and fit the model again with combine.quick=FALSE:

    baseint_fe = feols(inventor_foc_appNtk5 ~  prior_rev_n |examiner_art_unit^appYear|
                       ioc*gender_code ~ ear*gender_code, ptdata,
        combine.quick=FALSE)
    predictions(baseint_fe, by = c("ioc","gender_code"))
    # 
    #           ioc gender_code Estimate Std. Error      z Pr(>|z|)    S 2.5 % 97.5 %
    #  Successful             F     3.00     0.0115 261.15   <0.001  Inf  2.97   3.02
    #  Successful             M     3.06     0.0665  45.93   <0.001  Inf  2.93   3.19
    #  Unsuccessful           F     1.66     0.2709   6.13   <0.001 30.1  1.13   2.19
    #  Unsuccessful           M     2.26     0.2071  10.91   <0.001 89.6  1.85   2.66
    # 
    # Columns: ioc, gender_code, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high 
    # Type:  response