I'm trying to write a loop that produces a descriptive table for different combinations of variables so that I can compare their distributions. I've figured out how to iterate through my list and pass the variable name into a group_by() statement, but the same approach fails in the filter() statement. Does anyone have an idea of how to do this?
library(dplyr)
ID<-rep(c(1,2,3,4,5,6,7,8,9,10),10)
Black<-rep(c(0,1,0,0,0,1,1,0,1,1),10)
Asian<-rep(c(0,1,0,1,0,0,0,1,0,0),10)
Hispanic<-rep(c(1,0,0,0,1,0,0,0,1,0),10)
White<-rep(c(0,0,1,0,0,0,1,1,0,0),10)
Age1<-rep(c(0,5,43,25,31,22,17,12,59,25),10)
PTB<-rep(c(0,1,0,0,1,0,1,0,0,1, 1,0,1,1,0,1,0,1,1,0),5)
data1<-data.frame(ID, Black, Asian, Hispanic, White, Age1, PTB)
data1<-data1 %>%
  mutate(Black_Hispanic=ifelse(Black==1 & Hispanic==1, 1, 0),
         Asian_Hispanic=ifelse(Asian==1 & Hispanic==1, 1, 0),
         White_Hispanic=ifelse(White==1 & Hispanic==1, 1, 0),
         Black_Asian=ifelse(Black==1 & Asian==1, 1, 0),
         Black_White=ifelse(Black==1 & White==1, 1, 0),
         Asian_White=ifelse(Asian==1 & White==1, 1, 0))
transformed<-list()
Age<-list()
try<-data1 %>%
  dplyr::select(Black_Hispanic, Asian_Hispanic, White_Hispanic, Black_Asian, Black_White, Asian_White,
                Black, White, Asian, Hispanic)
list_names<-names(try)
for (k in seq_along(list_names)){
  transformed[[k]]<- data1 %>%
    group_by(paste(list_names[k]), PTB) %>%
    mutate(mean_age=mean(Age1, na.rm=TRUE),
           sd_age=sd(Age1, na.rm=TRUE),
           min_age=min(Age1, na.rm=TRUE),
           max_age=max(Age1, na.rm=TRUE),
           total_n=n()) %>%
    ungroup()
  Age[[k]]<-transformed[[k]] %>%
    filter(paste(list_names[k])==1) %>%
    distinct(PTB, mean_age,sd_age,min_age, max_age)
}
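See the dplyr vignette on programming ("Programming with dplyr"), in particular the section "Loop over multiple variables". paste(list_names[k]) only gives you a character string such as "Black", so filter() compares that string with 1, which is never TRUE. To refer to the column whose name is stored in a string, use the .data[[x]] pronoun.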
for (k in seq_along(list_names)){
  # Group by the column named in list_names[k]; paste() here would only
  # add a constant string column, so the grouping would effectively be by PTB alone
  transformed[[k]]<- data1 %>%
    group_by(.data[[ list_names[k] ]], PTB) %>%
    mutate(mean_age=mean(Age1, na.rm=TRUE),
           sd_age=sd(Age1, na.rm=TRUE),
           min_age=min(Age1, na.rm=TRUE),
           max_age=max(Age1, na.rm=TRUE),
           total_n=n()) %>%
    ungroup()
  # Keep only the rows where that column equals 1
  Age[[k]]<-transformed[[k]] %>%
    filter(.data[[ list_names[k] ]]==1) %>%
    distinct(PTB, mean_age,sd_age,min_age, max_age, Black_Hispanic, Asian_Hispanic, White_Hispanic, Black_Asian, Black_White, Asian_White,
             Black, White, Asian, Hispanic)
}
I added the indicator columns to distinct() so you can check that the filtering is working correctly. Please remove them in your production code.
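If it helps, here is a minimal sketch of the difference between a pasted string and the .data pronoun. The data frame `toy` and the variable `col` are made up for illustration and are not part of the question's data; I am assuming a reasonably recent dplyr (1.0 or later).

```r
library(dplyr)

# Hypothetical toy data, only for illustration
toy <- data.frame(Black = c(0, 1, 1), Age1 = c(10, 20, 30))
col <- "Black"

# paste(col) is just the string "Black"; "Black" == 1 is FALSE,
# so every row is dropped
n_string <- toy %>% filter(paste(col) == 1) %>% nrow()

# .data[[col]] looks up the column whose name is stored in col
n_pronoun <- toy %>% filter(.data[[col]] == 1) %>% nrow()

# The same pronoun works inside group_by()
by_group <- toy %>%
  group_by(.data[[col]]) %>%
  summarise(mean_age = mean(Age1), total_n = n(), .groups = "drop")

print(n_string)   # 0
print(n_pronoun)  # 2
print(by_group)
```

If all you need is one summary row per group, summarise() as in the sketch may be simpler than mutate() followed by distinct(), but that is a design choice and not required for the fix.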