r, dplyr, tidyverse, rlang

Is there a way to pass a string as a variable/column name to my function and use it in a call to mutate?


I have a data frame with one column listing the available choices (from a survey) and another column giving the index of the choice made in each row. For example:

library(tidyverse)

df <- tibble(
  record_id = 1:9,
  choices = c(rep("1, A | 2, B | 3, C", 3),
              rep("1, Apple | 2, Banana | 3, Cherry", 3),
              rep("1, America | 2, Belgium | 3, China", 3)),
  # choice is drawn at random, so the values differ between runs
  choice = sample(1:3, size = 9, replace = TRUE)
)

Which looks like this:

# A tibble: 9 × 3
  record_id choices                            choice
      <int> <chr>                               <int>
1         1 1, A | 2, B | 3, C                      3
2         2 1, A | 2, B | 3, C                      2
3         3 1, A | 2, B | 3, C                      3
4         4 1, Apple | 2, Banana | 3, Cherry        3
5         5 1, Apple | 2, Banana | 3, Cherry        3
6         6 1, Apple | 2, Banana | 3, Cherry        2
7         7 1, America | 2, Belgium | 3, China      2
8         8 1, America | 2, Belgium | 3, China      3
9         9 1, America | 2, Belgium | 3, China      3

I would like to create a column that recodes choice as the matching label from the choices column, like this:

# A tibble: 9 × 4
  record_id choices                            choice label
      <int> <chr>                               <int> <chr>
1         1 1, A | 2, B | 3, C                      3 C
2         2 1, A | 2, B | 3, C                      2 B
3         3 1, A | 2, B | 3, C                      3 C
4         4 1, Apple | 2, Banana | 3, Cherry        3 Cherry
5         5 1, Apple | 2, Banana | 3, Cherry        3 Cherry
6         6 1, Apple | 2, Banana | 3, Cherry        2 Banana
7         7 1, America | 2, Belgium | 3, China      2 Belgium
8         8 1, America | 2, Belgium | 3, China      3 China
9         9 1, America | 2, Belgium | 3, China      3 China

So far I've created a function to recode a choice, but it doesn't work in a pipe to mutate:

make_key <- function(.str) {
    
  lstr <- str_split(.str, pattern = " \\| ")
  
  out <- map(lstr, ~str_remove(.x, pattern = "^([0-9]+), ")) %>% as_vector()
  
  out_names <- map(lstr, ~str_extract(.x, pattern = "^([0-9]+)")) %>% as_vector()
  
  names(out) <- out_names
  
  return(out)
}

# Working example:
my_string <- c("1, A | 2, B | 3, C")
recode(1, !!!make_key(my_string))

[1] "A"

But when I use it in a call to dplyr::mutate(), it fails. I think it has something to do with passing a variable name to a function, but I'm not sure how to fix it.

rowwise(df) %>%
  mutate(label = recode(choice, !!!make_key(choices)))

Error in stri_split_regex(string, pattern, n = n, simplify = simplify, : 
object 'choices' not found

I have tried adding double braces {{}} to lstr <- str_split({{.str}}, pattern = " \\| "), as well as some rlang functions to deal with the problem, e.g., .str <- rlang::as_name(.str) or .str <- rlang::enquo(.str), but so far nothing has worked.


Solution

  • What about splitting each row's choices string and keeping the fragment whose code matches choice:

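    library(dplyr)
    
    pick_label <- \(choices, choice){
      # split "1, A | 2, B | 3, C" into c("1, A", "2, B", "3, C")
      frags <- unlist(strsplit(choices, ' \\| '))
      # keep the fragment whose code equals `choice`; the trailing comma
      # stops choice 1 from also matching codes such as 10 or 11
      frags[grepl(paste0('^', choice, ','), frags)] |>
        # drop the leading "<code>, " so only the label remains
        sub(pattern = '^[0-9]+, *', replacement = '')
    }
    
    df |>
      rowwise() |> 
      mutate(label = pick_label(choices, choice))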
    

    \(x) is shorthand for function(x) and, like the native pipe |>, requires R 4.1 or higher.
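
As for why the original attempt fails: `!!!` is evaluated when mutate() captures its arguments, before the columns of the data frame are in scope, so `choices` is looked up in the calling environment and is not found. One way to sidestep injection entirely is to build a named lookup vector per row and index it with the choice. The following is a minimal base-R sketch of that idea, not the only way to do it. The helper name `make_lookup` and the small three-row data frame are assumptions made for illustration, and the sketch assumes every entry has the form "<code>, <label>" with entries separated by " | ".

```r
# Hypothetical helper: turn "1, A | 2, B | 3, C" into c(`1` = "A", `2` = "B", `3` = "C").
make_lookup <- function(choices_string) {
  parts  <- strsplit(choices_string, " | ", fixed = TRUE)[[1]]
  codes  <- sub(",.*$", "", parts)         # text before the first comma
  labels <- sub("^[0-9]+, *", "", parts)   # text after "<code>, "
  setNames(labels, codes)
}

# Small deterministic example (illustrative data, not the random df above).
df <- data.frame(
  record_id = 1:3,
  choices = c("1, A | 2, B | 3, C",
              "1, Apple | 2, Banana | 3, Cherry",
              "1, America | 2, Belgium | 3, China"),
  choice = c(3L, 2L, 1L),
  stringsAsFactors = FALSE
)

# Walk over both columns in parallel and look each choice up by its code.
df$label <- mapply(
  function(choices_string, choice) {
    unname(make_lookup(choices_string)[as.character(choice)])
  },
  df$choices, df$choice,
  USE.NAMES = FALSE
)

df$label
# [1] "C"       "Banana"  "America"
```

A choice with no matching code yields NA rather than an error, which may or may not be what you want for survey data.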