Mutate columns based on value in a vector


I would like to mutate new columns onto a dataframe using a function where the inputs to the function come from a vector. Like this:

library(tidyverse)

# sessionInfo()
# R version 4.3.2 (2023-10-31)

set.seed(123)  # For reproducibility

mydf <- data.frame(user_id = 1:100) |> 
  expand_grid(data.frame(day = 1:365)) |> 
  mutate(logins = floor(abs(rnorm(1,10,10))))

lambdas <- c(0.01, 0.05, 0.1)

mydf <- mydf |> 
  mutate(
    lambda_logins_01 = logins * exp(-lambdas[1] * day),
    lambda_logins_05 = logins * exp(-lambdas[2] * day),
    lambda_logins_1 = logins * exp(-lambdas[3] * day),
    )

Except that, instead of writing out each "lambda_logins_01 = ..." line inside mutate(), I want to do it more elegantly using something like map.

I would like to use the native pipe and refer to cur_data() or an equivalent, as opposed to referring to mydf by name, i.e. I want to use the pipe in the traditional sense, referring to the current state of the data I'm working on.

The desired result is the same as in my example code, with the new columns named after the values of lambda, except that I wouldn't have to write out each mutate line manually.


Solution

  • You could use outer:

    mydf %>%
       mutate(data.frame(logins * exp(outer(day, -lambdas))) %>% 
                setNames(str_c('lambda_logins_', lambdas)))
    
    # A tibble: 36,500 × 6
       user_id   day logins lambda_logins_0.01 lambda_logins_0.05 lambda_logins_0.1
         <int> <int>  <dbl>              <dbl>              <dbl>             <dbl>
     1       1     1      4               3.96               3.80              3.62
     2       1     2      4               3.92               3.62              3.27
     3       1     3      4               3.88               3.44              2.96
     4       1     4      4               3.84               3.27              2.68
     5       1     5      4               3.80               3.12              2.43
     6       1     6      4               3.77               2.96              2.20
     7       1     7      4               3.73               2.82              1.99
     8       1     8      4               3.69               2.68              1.80
     9       1     9      4               3.66               2.55              1.63
    10       1    10      4               3.62               2.43              1.47
    

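    Edit: If you do not mind using the superseded map_dfc (note that the new columns are then named after the lambda values alone, e.g. 0.01, without the lambda_logins_ prefix):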

    mutate(mydf, map_dfc(set_names(lambdas), ~logins * exp(-.x*day)))
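The same idea can probably also be written with the native pipe and without the superseded map_dfc, by letting map() build a named list and turning it into a tibble inside mutate(). This is a minimal sketch, not the answerer's code: the small 3 x 5 grid and the name result are assumptions chosen for illustration, and the \(l) lambda syntax assumes R 4.1 or later.

```r
library(dplyr)
library(purrr)
library(tidyr)

set.seed(123)  # For reproducibility

lambdas <- c(0.01, 0.05, 0.1)

# Small illustrative grid (assumption: 3 users x 5 days instead of 100 x 365)
mydf <- expand_grid(user_id = 1:3, day = 1:5) |>
  mutate(logins = floor(abs(rnorm(1, 10, 10))))

result <- mydf |>
  mutate(
    # Name each lambda after the column it should produce, then map over them.
    # logins and day are looked up in the data frame currently being mutated.
    map(
      set_names(lambdas, paste0("lambda_logins_", lambdas)),
      \(l) logins * exp(-l * day)
    ) |>
      as_tibble()  # an unnamed data frame inside mutate() is unpacked into columns
  )

names(result)
```

Because the tibble is passed to mutate() without a name, its columns are added directly, so no reference to mydf by name is needed inside the pipeline.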