Tags: r, dplyr, tibble, across

How can I use mutate to create columns named in a vector?


I have a tibble, and I want to add columns to it using a named character vector. The vector's names should become the names of the new columns, and each new column should be filled with the corresponding value of the vector (repeated for every row). This is easy to do with a for loop, but I'm trying to understand how across works, and I'm running into two problems.

cv <- c("a"="x", "b"="y", "c"="z")  
tib <- tibble(c1=1:5)
myf <- function(x) { cv[x]}
tib %>% mutate(across(all_of(names(cv))), myf)  ## first problem
#   Error: Problem with `mutate()` input `..1`.
#   x Can't subset columns that don't exist.
#   x Columns `a`, `b`, and `c` don't exist.
#   ℹ Input `..1` is `across(all_of(names(cv)))`.
tib %>% mutate_at(all_of(names(cv)), myf)

for (x in names(cv)) { ## do it with a for loop
  tib[[x]] <- myf(x)
}
tib %>% mutate(across(all_of(names(cv)), myf)) ## second problem

which produces:

# A tibble: 5 x 4
     c1 a     b     c    
  <int> <chr> <chr> <chr>
1     1 NA    NA    NA   
2     2 NA    NA    NA   
3     3 NA    NA    NA   
4     4 NA    NA    NA   
5     5 NA    NA    NA   

Replacing the last line with tib %>% mutate_at(all_of(names(cv)), myf) produces the same incorrect behavior.

The first problem is that mutate with across refuses to create new columns, for reasons I can't understand. The second problem is that across doesn't seem to know what to do with myf: it seems to want some kind of closure that I don't know how to create (the same goes for mutate_at). I've looked briefly at rlang, but I can't make heads or tails of how to convert a regular function into the appropriate kind of object.


Solution

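  • across works on columns that already exist in the data: it either updates those columns in place or, when .names is supplied, creates new columns derived from them under different names. It cannot create columns from names alone, which is why the first call fails. Note also that across passes each column's values to the function, not its name, so after the for loop myf indexes cv with "x", "y" and "z", none of which is a name of cv, and every lookup returns NA. Here, one method is to loop over the names with map_dfc, create each column with transmute, and bind the results to the original data: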

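    library(purrr)
    library(dplyr)

    # Build one single-column tibble per name in cv, column-bind them,
    # then attach the result to the original data.
    map_dfc(names(cv), ~ tib %>%
                         transmute(!! .x := myf(.x))) %>%
      bind_cols(tib, .)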
    

    Output:

    # A tibble: 5 x 4
    #     c1 a     b     c    
    #  <int> <chr> <chr> <chr>
    #1     1 x     y     z    
    #2     2 x     y     z    
    #3     3 x     y     z    
    #4     4 x     y     z    
    #5     5 x     y     z
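
The points above about across can also be illustrated without map. The following is a minimal sketch, not part of the original answer, and it assumes dplyr 1.0.0 or later (for across and cur_column). It shows three things: across deriving a new column from an existing one via .names, across filling columns that already exist by looking up their names with cur_column(), and, as an alternative technique, splicing the named vector straight into mutate with !!!. The object names res_names, placeholders, res_across and res_splice are made up for illustration.

```r
library(dplyr)

cv  <- c(a = "x", b = "y", c = "z")
tib <- tibble(c1 = 1:5)

# 1. across() on an existing column: .names controls the name of the
#    derived column, so c1 is kept and c1_doubled is added.
res_names <- tib %>%
  mutate(across(c1, ~ .x * 2L, .names = "{.col}_doubled"))

# 2. across() on columns that already exist. The function receives the
#    column's values, so cur_column() is used to recover the column's
#    name and look up the matching element of cv.
placeholders <- tibble(
  c1 = 1:5,
  a  = NA_character_,
  b  = NA_character_,
  c  = NA_character_
)
res_across <- placeholders %>%
  mutate(across(all_of(names(cv)), ~ cv[[cur_column()]]))

# 3. Alternative without across(): splice the named vector so that each
#    element becomes a name = value argument to mutate(), which recycles
#    the single value to every row.
res_splice <- tib %>% mutate(!!!cv)
```

The splice form is probably the shortest route when the new columns are constants taken from a named vector, while the cur_column() form is more useful when the columns already exist and the function needs to know which one it is working on.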