
Paste string from list into dplyr filter inside for loop in R


I'm trying to write a loop that produces a descriptive table for each of several variables so that I can compare their distributions. I've figured out how to iterate through my list of variable names and pass each one into a group_by() statement, but the same approach fails in the filter() statement. Does anyone have an idea of how to do this?

library(dplyr)

ID<-rep(c(1,2,3,4,5,6,7,8,9,10),10)
Black<-rep(c(0,1,0,0,0,1,1,0,1,1),10)
Asian<-rep(c(0,1,0,1,0,0,0,1,0,0),10)
Hispanic<-rep(c(1,0,0,0,1,0,0,0,1,0),10)
White<-rep(c(0,0,1,0,0,0,1,1,0,0),10)
Age1<-rep(c(0,5,43,25,31,22,17,12,59,25),10)
PTB<-rep(c(0,1,0,0,1,0,1,0,0,1, 1,0,1,1,0,1,0,1,1,0),5)
data1<-data.frame(ID, Black, Asian, Hispanic, White, Age1, PTB)

data1<-data1 %>%
  mutate(Black_Hispanic=ifelse(Black==1 & Hispanic==1, 1, 0),
         Asian_Hispanic=ifelse(Asian==1 & Hispanic==1, 1, 0),
         White_Hispanic=ifelse(White==1 & Hispanic==1, 1, 0),
         Black_Asian=ifelse(Black==1 & Asian==1, 1, 0),
         Black_White=ifelse(Black==1 & White==1, 1, 0),
         Asian_White=ifelse(Asian==1 & White==1, 1, 0))

transformed<-list()
Age<-list()

try<-data1 %>%
  dplyr::select(Black_Hispanic, Asian_Hispanic, White_Hispanic, Black_Asian, Black_White, Asian_White, 
                Black, White, Asian, Hispanic)
list_names<-names(try)

for (k in seq_along(list_names)){
  transformed[[k]]<- data1 %>%
    group_by(paste(list_names[k]), PTB) %>%
    mutate(mean_age=mean(Age1, na.rm=TRUE),
           sd_age=sd(Age1, na.rm=TRUE),
           min_age=min(Age1, na.rm=TRUE),
           max_age=max(Age1, na.rm=TRUE),
           total_n=n()) %>%
    ungroup()
  
  Age[[k]]<-transformed[[k]] %>%
    filter(paste(list_names[k])==1) %>%
    distinct(PTB, mean_age,sd_age,min_age, max_age)
}

Solution

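  • See the "Loop over multiple variables" section of the dplyr vignette "Programming with dplyr". paste(list_names[k]) only returns the column name as a character string, so filter() compares that string with 1 on every row and keeps nothing. To refer to a column whose name is stored in a string, index the .data pronoun: .data[[x]].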

    for (k in seq_along(list_names)){
       transformed[[k]]<- data1 %>%
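          # paste() here would group by a single constant string, i.e. by PTB only;
          # .data[[ ]] groups by the values of the named column instead
          group_by(.data[[ list_names[k] ]], PTB) %>%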
          mutate(mean_age=mean(Age1, na.rm=TRUE),
                 sd_age=sd(Age1, na.rm=TRUE),
                 min_age=min(Age1, na.rm=TRUE),
                 max_age=max(Age1, na.rm=TRUE),
                 total_n=n()) %>%
          ungroup()
       
       Age[[k]]<-transformed[[k]] %>%
          filter(.data[[ list_names[k] ]]==1) %>%
          distinct(PTB, mean_age,sd_age,min_age, max_age, Black_Hispanic, Asian_Hispanic, White_Hispanic, Black_Asian, Black_White, Asian_White, 
                   Black, White, Asian, Hispanic)
    }
    

    I added the indicator columns to distinct() only to check that the filter works correctly. Remove them in your production code.
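
To make the difference between the string and the .data pronoun concrete, here is a minimal sketch on a small made-up data frame. The names toy, vars and age_tables are hypothetical and not from the question. It also shows one possible simplification, assuming all you need is the per-PTB summary for rows where the indicator equals 1: filter first, then summarise(), instead of mutate() followed by distinct().

```r
library(dplyr)

# Small made-up data set (hypothetical; not the asker's data)
toy <- data.frame(
  Black = c(1, 1, 0, 0, 1, 0),
  Asian = c(0, 1, 1, 0, 0, 1),
  PTB   = c(0, 1, 0, 1, 1, 1),
  Age1  = c(20, 30, 40, 50, 60, 70)
)

vars <- c("Black", "Asian")

# paste() returns the column *name* as a string, so this compares
# "Black" with 1 on every row and keeps nothing
nrow(filter(toy, paste(vars[1]) == 1))    # 0

# .data[[name]] looks the column up by name inside the data frame
nrow(filter(toy, .data[[vars[1]]] == 1))  # 3

# One summary table per variable: rows where the flag is 1, split by PTB
age_tables <- lapply(vars, function(v) {
  toy %>%
    filter(.data[[v]] == 1) %>%
    group_by(PTB) %>%
    summarise(
      mean_age = mean(Age1, na.rm = TRUE),
      sd_age   = sd(Age1, na.rm = TRUE),
      min_age  = min(Age1, na.rm = TRUE),
      max_age  = max(Age1, na.rm = TRUE),
      total_n  = n(),
      .groups  = "drop"
    )
})
names(age_tables) <- vars
```

Each element of age_tables can then be inspected by name, for example age_tables$Black. Whether filtering before summarising matches what you want depends on whether you also need the statistics for the rows where the indicator is 0.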