Search code examples
rmarginal-effects

Marginal Effect from svyglm object with a subsample in R


I need to compute marginal effects out of a Generalized Linear Model (family=Poisson) estimated via the svyglm function from the R package survey for a subsample.

First, I declared the survey desgin with:

myDesisgn = svydesign(id=data$id, strata=data$strata, weights=data$sw, data=data)

Second, I estimated my model as:

fit = svyglm(y~ x1 +x2,  design=myDesisgn, data=data, subset= x3 == 1, family= poisson(link = "log"))

Finally, when I want to get the Average Marginal Effect for, let's say, x1 I run:

summary(margins(fit, variables = "x1", design=myDesisgn))

... but I get the following error message:

"Error in h(simpleError(msg, call)) : error in evaluating the argument 'object' in selecting a method for function 'summary': 'x' and 'w' must have the same length"

Running the following does not work either:

summary(margins(fit, variables = "x1", design=myDesisgn, subset=x3==1))

Solution

  • Solution:

    summary(margins(fit, variables = "x1", design=myDesisgn[myDesisgn$variables$x3 == 1]))
    

    Subsetting complex surveys leads to problems in the error estimation. When interested in a parameter for a specific subsample, one should use the desired subsample to estimate the parameter of interest and the full sample for the estimation of its error.

    For example, svyglm(y~x, data=data, subset = z == 1) does exactly this (beta_hat estimated using observations for which z=1 and se(beta_hat) using the full sample).

    Subsetting a svy design is possible and it keeps the original design information about number of clusters, strata. The code shown above is the "manual" way of doing so. Alternative one can directly rely on the subset.survey.design {survey} function.

    myDesign_subset <- subset(myDesign, data$x3 == 1)
    

    The two methods are equivalent and produce correct z-stats.