r dataframe dplyr non-standard-evaluation

Use non-standard evaluation and mutate in an R data frame with dplyr


I'm having trouble figuring out how to use strings containing R data frame column names to do some basic calculations and store the results in new columns with mutate. For example, I have a column of baseline values and other columns with post-treatment timepoints. I want to refer to the columns by string because the timepoints differ between data sets and I want a programmatic solution.

For example, I have this data frame, and I think I need to use some of the syntax in my mutate line below, but can't figure out exactly how to write the right hand side. I want columns called 'day1_fc' and 'day2_fc' to represent the fold change of day1/baseline, and day2/baseline.

df <- data.frame(day0 = c(1, 0.5, 2),
                 day1 = c(2,3,4),
                 day2 = c(3,4,5))

baseline = 'day0'
sym_baseline <- sym(baseline)

post = c('day1', 'day2')
post1 <- post[1]
post2 <- post[2]

df %>% 
  mutate(!!paste0(post1, '_fc') := ?????,
         !!paste0(post2, '_fc') := ?????)

I want the result to look like:

  df <- data.frame(day0 = c(1, 0.5, 2),
                   day1 = c(2, 3, 4),
                   day2 = c(3, 4, 5),
                   day1_fc = c(2, 6, 2),
                   day2_fc = c(3, 8, 2.5))

Solution

  • You can convert each column name to a symbol with sym() and unquote it with !! on the right-hand side:

    library(dplyr)
    library(rlang)
    
    df %>% 
      mutate(!!paste0(post1, '_fc') := !!sym(post[1])/!!sym_baseline,
             !!paste0(post2, '_fc') := !!sym(post[2])/!!sym_baseline)
    
    #  day0 day1 day2 day1_fc day2_fc
    #1  1.0    2    3       2     3.0
    #2  0.5    3    4       6     8.0
    #3  2.0    4    5       2     2.5
    

    A general solution for any number of values in post is to build each new column with purrr::map_dfc() and bind the results to df:

    bind_cols(
      df,
      purrr::map_dfc(post,
                     ~ df %>% transmute(!!paste0(.x, '_fc') := !!sym(.x) / !!sym_baseline))
    )
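As an alternative to the approach above, the same fold-change columns can probably be built in a single mutate() call with across(). This is only a sketch, assuming dplyr 1.0.0 or later; the variable name result is introduced here for illustration and is not part of the original question or answer.

```r
library(dplyr)

# Same example data as in the question.
df <- data.frame(day0 = c(1, 0.5, 2),
                 day1 = c(2, 3, 4),
                 day2 = c(3, 4, 5))

baseline <- "day0"
post <- c("day1", "day2")

# Divide every column named in `post` by the baseline column.
# all_of() selects columns from a character vector, .data[[baseline]]
# looks the baseline column up by its string name, and .names builds
# the new column names (day1_fc, day2_fc) from the original ones.
result <- df %>%
  mutate(across(all_of(post),
                ~ .x / .data[[baseline]],
                .names = "{.col}_fc"))

result
```

With this form, no explicit sym() or !! is needed, and adding more timepoints only means extending post. Whether it reads better than the map_dfc() version is a matter of taste.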