
Storing and calling variables in a column in dplyr within a function


I want to store a set of column names in a list-column of a tibble. I then want to use that column either to paste the stored names together, or to paste together the values of the columns those names refer to. All of this happens inside a function, and it is the only hard-coded piece left, so I'd really like to find a way to solve it.

library("tidyverse") 
myData<-tibble("c1"=c("a","b","c"),
"c2"=c("1","2","3"),
"c3"=c("A","B","C"),
factors=c(list(c("c1","c2")),list(c("c2","c3")),list(c("c1","c2","c3"))))

myData%>%mutate(factors1=interaction(!!!quos(factors),sep=":",lex.order=TRUE))
# A tibble: 3 x 5
  c1    c2    c3    factors   factors1
  <chr> <chr> <chr> <list>    <fct>   
1 a     1     A     <chr [2]> c1:c2:c1
2 b     2     B     <chr [2]> c2:c3:c2
3 c     3     C     <chr [3]> c1:c2:c3

So this allows me to concatenate the names of the variables, but as you can see, if one vector is longer than the others, the shorter ones are recycled to match its length.

For the second problem in which I would like to use the $factors column to specifically call the values of other columns, I can hardcode this like so:

myData%>%
mutate(factors2=interaction(!!!syms(c("c1","c2")),sep=":",lex.order=TRUE))
# A tibble: 3 x 5
 c1    c2    c3    factors   factors2
 <chr> <chr> <chr> <list>    <fct>   
1 a     1     A     <chr [2]> a:1     
2 b     2     B     <chr [2]> b:2     
3 c     3     C     <chr [3]> c:3  

However if I try this:

myData%>%
mutate(factors2=interaction(!!!syms(factors),sep=":",lex.order=TRUE))

Error in lapply(.x, .f, ...) : object 'factors' not found

The same happens if I try to unlist the factors or use other rlang expressions. I have also tried nesting rlang expressions but so far haven't found one that works as I intended.

I feel like this should be possible, but so far I haven't found a Stack Overflow question or a tutorial that shows how, so maybe I'm on a wild goose chase. Thank you all for your time and help.

My code in full:

library("tidyverse") 

myData<-tibble("c1"=c("a","b","c"),
"c2"=c("1","2","3"),
"c3"=c("A","B","C"),
factors=c(list(c("c1","c2")),list(c("c2","c3")),list(c("c1","c2","c3"))))%>%
mutate(factors1=interaction(!!!quos(factors),sep=":",lex.order=TRUE))%>%
mutate(factors2=interaction(!!!syms(factors),sep=":",lex.order=TRUE))

My desired output is:

# A tibble: 3 x 6
  c1    c2    c3    factors   factors1 factors2
  <chr> <chr> <chr> <list>    <fct>    <fct>
1 a     1     A     <chr [2]> c1:c2    a:1
2 b     2     B     <chr [2]> c2:c3    2:B
3 c     3     C     <chr [3]> c1:c2:c3 c:3:C

Solution

  • Here is a method using map and imap:

    library(tidyverse)
    
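    myData %>%
      mutate(
        # join the stored column names for each row, e.g. "c1:c2"
        factor1 = factors %>% map(~interaction(as.list(.), sep = ":", lex.order = TRUE)) %>% unlist(),
        # .y is the row index from imap; look up the named columns in that row and join their values
        factor2 = factors %>% imap(~interaction(myData[.y, match(.x, names(myData))], sep = ":", lex.order = TRUE)) %>% unlist()
      )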
    

    For factor1, instead of splicing the arguments into dots, I pass a list into interaction.

    For factor2, I match the names stored in factors for each row against names(myData) and use the resulting column indices (match(.x, names(myData))) together with the row index (.y from imap) to subset the elements that are fed into interaction.

    Both factor1 and factor2 require an unlist because map and imap return lists.
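
To illustrate the two points above, here is a small base R sketch. The variable names (via_dots, via_list, per_row, flat) are made up for this illustration. It relies on interaction accepting a single list in place of separate arguments, and on unlist combining a list of factors into one factor.

```r
# Passing one list to interaction() behaves like passing its elements via `...`.
via_dots <- interaction("c1", "c2", sep = ":", lex.order = TRUE)
via_list <- interaction(list("c1", "c2"), sep = ":", lex.order = TRUE)

# Mapping over rows yields a list holding one length-one factor per row.
per_row <- lapply(
  list(c("c1", "c2"), c("c2", "c3")),
  function(x) interaction(as.list(x), sep = ":", lex.order = TRUE)
)

# unlist() merges those factors into a single factor vector,
# which is the shape a tibble column needs.
flat <- unlist(per_row)
```

Here via_dots and via_list carry the same label, and flat is a factor with one element per row rather than a list.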

    Output of the mutate call:

    # A tibble: 3 x 6
      c1    c2    c3    factors   factor1  factor2
      <chr> <chr> <chr> <list>    <fct>    <fct>  
    1 a     1     A     <chr [2]> c1:c2    a:1    
    2 b     2     B     <chr [2]> c2:c3    2:B    
    3 c     3     C     <chr [3]> c1:c2:c3 c:3:C
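
Since the question mentions that all of this runs inside a function, here is a rough sketch of one way to wrap the idea so that it does not refer to the global myData by name. The helper name add_factor_columns and its col argument are hypothetical and not part of the original answer. It also swaps interaction for plain paste plus factor, so the labels should match the output above, but the factor levels are only the labels that actually occur.

```r
library(tibble)
library(purrr)

# Hypothetical helper: `data` is a tibble, `col` names a list-column
# whose cells hold character vectors of column names.
add_factor_columns <- function(data, col = "factors") {
  specs <- data[[col]]
  rows <- seq_len(nrow(data))

  # Join the stored names themselves, e.g. "c1:c2".
  name_labels <- map_chr(specs, ~ paste(.x, collapse = ":"))

  # Join the values those names point to in the same row, e.g. "a:1".
  value_labels <- map2_chr(specs, rows, function(cols, i) {
    values <- vapply(cols, function(nm) as.character(data[[nm]][i]), character(1))
    paste(values, collapse = ":")
  })

  data[["factor1"]] <- factor(name_labels)
  data[["factor2"]] <- factor(value_labels)
  data
}

myData <- tibble(
  c1 = c("a", "b", "c"),
  c2 = c("1", "2", "3"),
  c3 = c("A", "B", "C"),
  factors = list(c("c1", "c2"), c("c2", "c3"), c("c1", "c2", "c3"))
)

result <- add_factor_columns(myData)
```

Because the data frame is passed in as an argument, the same function can be reused on any tibble that has a list-column of column names.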