I have an R object called "imps" that holds multiple imputed datasets, each stored as a data frame.
Within each of those data frames, there is a column (or variable) for gender (where gender=1 or gender=0).
I'm trying to figure out whether I can subset "imps" so that every data frame within it contains only the observations for one gender (either gender=1 or gender=0).
I know how to do this for a single data frame: I pick one and run the subset function on it, e.g.:
imputed_data1 <- imps[[5]] #selecting the 5th imputed dataset
imputed_gender <- subset(imputed_data1, gender==1)
My issue is that I want to keep all of the data frames (there are hundreds of them), but go inside each one and keep only the observations where gender=1 (or, separately, gender=0).
Is this possible to do? Any help would be much appreciated.
We can apply subset to every data frame in the list with lapply, which returns a list of the same length:
imps1 <- lapply(imps, subset, subset = gender == 1)
imps0 <- lapply(imps, subset, subset = gender == 0)
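The `subset` documentation describes it as a convenience function for interactive use and recommends standard `[` indexing for programming. A minimal sketch of the same idea with `[`, assuming `imps` is a plain list of data frames: `imps_toy` and `keep_gender` are hypothetical names made up for this illustration, and the toy data stands in for the real imputations.

```r
# Toy stand-in for `imps`: a list of three small data frames (made-up data).
set.seed(1)
imps_toy <- lapply(1:3, function(i) {
  data.frame(id = 1:6, gender = c(1, 0, 1, 0, 1, 0), score = rnorm(6))
})

# Hypothetical helper: keep only the rows of `d` whose gender equals `value`.
# which() drops rows where the comparison is NA instead of returning NA rows.
keep_gender <- function(d, value) {
  d[which(d$gender == value), , drop = FALSE]
}

# Apply the helper to every data frame; the list structure is preserved.
imps_toy1 <- lapply(imps_toy, keep_gender, value = 1)
imps_toy0 <- lapply(imps_toy, keep_gender, value = 0)

# Quick check: number of rows kept in each data frame.
print(vapply(imps_toy1, nrow, integer(1)))
print(vapply(imps_toy0, nrow, integer(1)))
```

On the real data this would be `lapply(imps, keep_gender, value = 1)`.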
Or using the tidyverse:
library(dplyr)
library(purrr)
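# filter each data frame in the list; map returns a list of the same length
imps1 <- map(imps, ~ filter(.x, gender == 1))
imps0 <- map(imps, ~ filter(.x, gender == 0))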
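If both groups are needed anyway, base R's `split` can produce them in one pass per data frame. This is only a sketch on made-up data: `imps_toy` and `by_gender` are hypothetical names, and it assumes `gender` takes only the values 0 and 1.

```r
# Toy stand-in for `imps`: a list of three small data frames (made-up data).
set.seed(1)
imps_toy <- lapply(1:3, function(i) {
  data.frame(id = 1:6, gender = c(1, 0, 1, 0, 1, 0), score = rnorm(6))
})

# For each data frame, split its rows by gender.
# Each element of by_gender is itself a list with elements named "0" and "1".
by_gender <- lapply(imps_toy, function(d) split(d, d$gender))

# Pull out one group across all imputations, giving a list of data frames.
imps_toy1 <- lapply(by_gender, `[[`, "1")
imps_toy0 <- lapply(by_gender, `[[`, "0")

# Example: the gender == 1 rows of the second toy data frame.
print(by_gender[[2]][["1"]])
```

The names "0" and "1" come from the values of `gender`, so they are character strings when indexing the split result.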