Search code examples
rlogistic-regressionsurvey

Reporting average marginal effects of a survey-weighted logit model with R


I'm working with survey data of a complex sample to estimate binary outcome models. I am trying to report average marginal effects of a logit model, which I estimated through svyglm of the survey package in R. However, I get the following error when I use margins from the package of the same name:

margins(fit, design = lapop) %>% summary()

Error in h(simpleError(msg, call)) : error in evaluating the argument 'object' in selecting a method for function 'summary': arguments imply differing number of rows: 6068, 6054

Seems it is not the summary function, since the error pops up when executing the margins command with its arguments. I have tried to simply ignore the survey weights at all and shows me equal coefficients and AMEs but not standard errors. Obviously, I cannot present this work by ignoring the survey weights. So I guess what I really need is the standard errors.

I have been reading on the topic and have found no clear solution, I suspect it might have something to do with missing values of the X in the model, but as with any other linear model, R should be just working with complete cases.

I'm not sure if anybody knows anything about this, or if I should simply just report AMEs without standard errors (and thus without p-values). I have uploaded a MWE if anyone is interested, which can be found here.


Solution

  • I found out what was happening with this: turns out it's some kind of mistake in the package's code. You can take a look at the specifics here. The solution as of today is to install Tomasz Żółtak's forks of the prediction and margins packages until his pull requests on Github are merged.

    devtools::install_github("tzoltak/prediction")
    devtools::install_github("tzoltak/margins")
    

    This must be done after installing the devtools package if you haven't already.

    install.packages('devtools')
    

    After doing this, running margins() on a model should no longer produce errors if the model's dataframe has missing values on some or all of the model's covariates. Thus, it will present average partial effects with their corresponding survey-weighted standard errors. Check out a MWE here.

    Hopefully in the future calling margins directly from CRAN will be enough to not produce this error with survey-weighted models.