# Load dplyr
library(dplyr)
# Set up simple data frame with one column
data <- data.frame(a = 1:5)
# Define two functions
newfun1 <- function(x, val) {x + val}
newfun2 <- function(x, val) {x * val}
# Store functions as named list
usefuns <- stats::setNames(as.list(c(newfun1, newfun2)), c("fun1", "fun2"))
I want to apply usefuns to column a of data, specifying that the val argument should be 100.
Using dplyr < 1.1.0, I can make it work easily:
data %>% mutate(across(.cols = a, .fns = usefuns, val = 100))
[Image: results of the previous code; a data frame with three columns]
However, using dplyr 1.1.0, I get this warning:
The ... argument of across() is deprecated as of dplyr 1.1.0. Supply arguments directly to .fns through an anonymous function instead.
# Previously
across(a:b, mean, na.rm = TRUE)
# Now
across(a:b, \(x) mean(x, na.rm = TRUE))
I can make it work with dplyr 1.1.0 using:
data %>% mutate(across(.cols = a, .fns = list(fun1 = ~newfun1(.x, val = 100), fun2 = ~newfun2(.x, val = 100))))
or even
data %>% mutate(across(.cols = a, .fns = list(fun1 = ~usefuns$fun1(.x, val = 100), fun2 = ~usefuns$fun2(.x, val = 100))))
[Image: results of the previous code; a data frame with three columns]
This works, but I know there must be a simpler way. In the real-world scenario where I'm using this, the number of functions contained in usefuns will vary and there are several more arguments, but the arguments passed to each function will always be the same.
I think I'm missing something relatively simple and have already spent too much time experimenting. Any pointers are appreciated!
As an added note, val may have differing values each time it is used:
# Set up simple data frame with three columns
data <- data.frame(a = 1:5, b = 6:10, c = 11:15)
# Example of more complicated application of functions using dplyr < 1.1.0:
data %>%
  mutate(across(.cols = c(a, b), .fns = usefuns, val = 100),
         across(.cols = c, .fns = usefuns, val = 200))
I've tried variations on listing and naming the functions and on how they are stored and called, and I started going down the path of using purrr, but couldn't get any of it as close to working as the code above. I'm wondering if the partial() function could come into play, but I can't quite figure out how, or whether, that would work.
You can pass your function list to purrr::map (or lapply) and use purrr::partial within it, passing the value of val.
library(purrr)
data %>%
  mutate(across(.cols = a,
                .fns = purrr::map(usefuns,
                                  purrr::partial,
                                  val = 100)))
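To see what partial() contributes here, it may help to look at it outside of across(). The sketch below redefines the two functions from the question so that it runs on its own; the names add100 and fixed are only illustrative and do not come from the original post.

```r
library(purrr)

# Same functions as in the question, stored as a named list
newfun1 <- function(x, val) {x + val}
newfun2 <- function(x, val) {x * val}
usefuns <- list(fun1 = newfun1, fun2 = newfun2)

# partial() returns a new function with val already filled in,
# so only x is left to supply
add100 <- partial(newfun1, val = 100)
add100(1:5)

# map() calls partial() on every element of the list and keeps the
# list names, which across() then uses to name the new columns
fixed <- map(usefuns, partial, val = 100)
names(fixed)
fixed$fun2(1:5)
```

Because every element of fixed is now a one-argument function, across() no longer needs anything passed through its deprecated ... argument.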
Or for the more complex example:
data <- data.frame(a = 1:5, b = 6:10, c = 11:15)
data %>%
  mutate(across(.cols = c(a, b),
                .fns = map(usefuns, partial, val = 100)),
         across(.cols = c,
                .fns = map(usefuns, partial, val = 200)))
  a  b  c a_fun1 a_fun2 b_fun1 b_fun2 c_fun1 c_fun2
1 1  6 11    101    100    106    600    211   2200
2 2  7 12    102    200    107    700    212   2400
3 3  8 13    103    300    108    800    213   2600
4 4  9 14    104    400    109    900    214   2800
5 5 10 15    105    500    110   1000    215   3000
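Since lapply was mentioned as an alternative to purrr::map, here is a rough base R sketch of the same idea that avoids the purrr dependency. The helper name with_args is hypothetical (it is not part of dplyr or purrr); it simply wraps each function in a closure that fixes whatever extra arguments are supplied, which should also cover the case where several arguments are shared by all functions.

```r
library(dplyr)

# Same functions as in the question, stored as a named list
newfun1 <- function(x, val) {x + val}
newfun2 <- function(x, val) {x * val}
usefuns <- list(fun1 = newfun1, fun2 = newfun2)

# Hypothetical helper: returns a list of one-argument functions in which
# the arguments given in ... are fixed; lapply() preserves the list names
with_args <- function(funs, ...) {
  args <- list(...)
  lapply(funs, function(f) {
    force(f)
    function(x) do.call(f, c(list(x), args))
  })
}

data <- data.frame(a = 1:5, b = 6:10, c = 11:15)

# val differs between the two across() calls, as in the question
result <- data %>%
  mutate(across(.cols = c(a, b), .fns = with_args(usefuns, val = 100)),
         across(.cols = c, .fns = with_args(usefuns, val = 200)))
result
```

Under these assumptions, result should have the same columns as the purrr version above. Further shared arguments can be added to the with_args() call by name, provided every function in the list accepts them.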