Oaxaca decomposition on weighted survey data in R


I'd like to implement an Oaxaca decomposition in R. It is used in, e.g., labor economics to split the gap in a mean outcome (such as wages) between two groups into an explained part and an unexplained part.

This is fairly easy to do with unweighted data using the oaxaca package (see previous explanation here for general oaxaca usage). However, the oaxaca package does not currently support weighted survey data such as the Current Population Survey.

The survey package is the most popular package for dealing with survey data in R, but it has no straightforward way to perform an Oaxaca decomposition.

Below is an example that notes the apparent limitations of the two packages:

# Note the lack of support for an "Oaxaca decomposition command":
library(survey)
data(api)
# The line below weights the data
dclus2 <- svydesign(id = ~dnum + snum, weights = ~pw, data = apiclus2)
model0 <- svyglm(I(sch.wide == "Yes") ~ ell + meals + mobility,
                 design = dclus2, family = quasibinomial())

# Note the lack of support for survey weights:
library(oaxaca)
data("chicago")
# The line below will not work if my data is a survey.design object (i.e. weighted data)
oaxaca.results <- oaxaca(ln.real.wage ~ age + female + LTHS + some.college +
                           college + advanced.degree | foreign.born,
                         data = chicago, R = 50)

If anyone can tell me either how to make the oaxaca package compatible with survey weights or how to implement an Oaxaca decomposition on a survey.design object, that would be much appreciated. Any pointers?


Solution

  • The "decr" package has functions that can apply weights and perform decompositions on the weighted data.
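
If you would rather stay within the survey package, a twofold decomposition can also be put together by hand. The following is only a rough sketch under my own assumptions, not how oaxaca or decr work internally: it reuses the apiclus2 design from the question, takes api00 as the outcome and sch.wide as the grouping variable purely for illustration, and uses the "No" group's coefficients as the reference. It returns point estimates only.

```r
library(survey)
data(api)

# Same weighted design as in the question
dclus2 <- svydesign(id = ~dnum + snum, weights = ~pw, data = apiclus2)

# Keep complete cases so regressions and means use the same rows
dcc <- subset(dclus2, complete.cases(api00, ell, meals, mobility))

# Split the design by group (illustrative choice: sch.wide)
d_a <- subset(dcc, sch.wide == "Yes")
d_b <- subset(dcc, sch.wide == "No")

# Weighted outcome regression within each group
fit_a <- svyglm(api00 ~ ell + meals + mobility, design = d_a)
fit_b <- svyglm(api00 ~ ell + meals + mobility, design = d_b)
b_a <- coef(fit_a)
b_b <- coef(fit_b)

# Weighted covariate means, with a leading 1 for the intercept
xbar_a <- c(1, coef(svymean(~ell + meals + mobility, d_a)))
xbar_b <- c(1, coef(svymean(~ell + meals + mobility, d_b)))

# Twofold decomposition with group B ("No") coefficients as reference
explained   <- sum((xbar_a - xbar_b) * b_b)  # differences in characteristics
unexplained <- sum(xbar_a * (b_a - b_b))     # differences in coefficients

# Weighted gap in mean outcomes, for comparison
gap <- coef(svymean(~api00, d_a)) - coef(svymean(~api00, d_b))

print(c(gap = unname(gap), explained = explained, unexplained = unexplained))
```

Because each weighted regression includes an intercept, explained + unexplained should reproduce the weighted gap up to rounding. Standard errors are not covered here; they would need something extra, such as replicate weights.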