r dplyr across

How to programmatically apply a transformation to multiple variables and keep both the raw and the transformed variables with dplyr in R


I have a large dataset and would like to apply transformations to some of its variables programmatically. To illustrate, say I want to take the log of the variables named in a character vector. I would like to keep the input variables and generate a new variable for each one, named by prepending a prefix (or appending a suffix) to the original name. Since a few lines of code are worth a thousand paragraphs: I want the result in df_aim, but written in a less repetitive fashion, for example with something like the syntax used for df_syntax.

reprex

library(tidyverse)
data(mtcars)

vars_to_transf <- c("disp", "hp", "drat")

# these results 
df_aim <- mtcars %>% 
    mutate(
        ln_disp =  log(disp), 
        ln_hp   =  log(hp),
        ln_drat =  log(drat)
    )

# with something like this syntax 
df_syntax <- mtcars %>% 
    mutate(across(all_of(vars_to_transf), .fns =  log))
> head(df_aim)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb ln_disp ln_hp ln_drat
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4   5.075 4.700   1.361
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4   5.075 4.700   1.361
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1   4.682 4.533   1.348
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1   5.553 4.700   1.125
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2   5.886 5.165   1.147
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1   5.416 4.654   1.015
> head(df_syntax)
                   mpg cyl  disp    hp  drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6 5.075 4.700 1.361 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6 5.075 4.700 1.361 2.875 17.02  0  1    4    4
Datsun 710        22.8   4 4.682 4.533 1.348 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6 5.553 4.700 1.125 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8 5.886 5.165 1.147 3.440 17.02  0  0    3    2
Valiant           18.1   6 5.416 4.654 1.015 3.460 20.22  1  0    3    1

I appreciate your attention and apologise should this question be a duplicate.


Solution

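  • Pass the function in a named list. across() then keeps the original columns and names each new column with the default pattern "{.col}_{.fn}" (here disp_log, hp_log and drat_log). Wrapping the character vector in all_of() avoids tidyselect's warning about using an external vector in a selection:

    mtcars %>% 
        mutate(across(all_of(vars_to_transf), list(log = log)))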
    

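    To apply more than one function, add each one to the named list. The .names argument controls how each new column name is built from the column name ({.col}) and the function's name in the list ({.fn}):

    mtcars %>% 
        mutate(across(all_of(vars_to_transf), 
                      list(log = log, sqrt = sqrt), 
                      .names = "{.col}_{.fn}"))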
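
The calls above produce suffixed names such as disp_log, whereas the question asks for prefixed names such as ln_disp. A minimal sketch of one way to get those, assuming dplyr 1.0.0 or later (where across() and its .names argument are available): pass a single function and put the literal prefix in the .names glue string. The name df_ln is only illustrative; df_aim is rebuilt here as in the question so the two can be compared.

```r
library(dplyr)

vars_to_transf <- c("disp", "hp", "drat")

# Single function plus a .names pattern: the originals are kept and
# each new column is named "ln_" followed by the source column name.
df_ln <- mtcars %>%
    mutate(across(all_of(vars_to_transf), log, .names = "ln_{.col}"))

# Hand-written version from the question, for comparison.
df_aim <- mtcars %>%
    mutate(
        ln_disp = log(disp),
        ln_hp   = log(hp),
        ln_drat = log(drat)
    )

# Compare the two results.
all.equal(df_ln, df_aim)
```

For a suffix instead of a prefix, the same pattern can be reversed, for example .names = "{.col}_ln".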