r, dplyr, across

How to mutate multiple columns as a function of multiple columns systematically?


I have a tibble with a number of variables collected over time. A very simplified version of the tibble looks like this.

library(tidyverse)

df <- tribble(
  ~id, ~varA.t1, ~varA.t2, ~varB.t1, ~varB.t2,
  'row_1', 5, 10, 2, 4,
  'row_2', 20, 50, 4, 6
)

I want to systematically create a new set of variables varC so that varC.t# = varA.t# / varB.t#, where # is 1, 2, 3, etc. (following the same naming pattern as the columns in the tibble above).

How do I use something along the lines of mutate or across to do this?


Solution

  • You can do this with mutate(across(...)). The version below writes the results to temporary columns prefixed with V_ and renames them afterwards; there is probably a shorter way to handle the renaming.

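    df %>%
      mutate(across(.cols = c(varA.t1, varA.t2),
                    # divide each varA column by the varB column with the same time suffix
                    .fns = ~ .x / get(str_replace(cur_column(), "varA", "varB")),
                    # store the results in temporary V_-prefixed columns
                    .names = "V_{.col}")) %>%
      # rename V_varA.t# to varC.t#
      rename_with(~str_replace(., "V_varA", "varC"), starts_with("V_"))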
    
    # A tibble: 2 x 7
      id    varA.t1 varA.t2 varB.t1 varB.t2 varC.t1 varC.t2
      <chr>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
    1 row_1       5      10       2       4     2.5    2.5 
    2 row_2      20      50       4       6     5      8.33
    

    If the time series is long, you can also build the vector of column names for .cols beforehand instead of listing each column by hand.
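
As a rough sketch of that idea, the varA column names can be collected with a pattern match and passed to across() through all_of(). The name a_cols is only illustrative, and base sub() stands in for str_replace() so that only dplyr and tibble are needed. The sketch also assumes that the .names glue specification may contain a small expression, which would make the separate rename_with() step unnecessary; treat that as one possible shortcut rather than the only way to do it.

```r
library(dplyr)
library(tibble)

df <- tribble(
  ~id, ~varA.t1, ~varA.t2, ~varB.t1, ~varB.t2,
  "row_1", 5, 10, 2, 4,
  "row_2", 20, 50, 4, 6
)

# Collect every varA.t# column name up front (a_cols is a hypothetical helper name)
a_cols <- grep("^varA\\.", names(df), value = TRUE)

result <- df %>%
  mutate(across(
    .cols = all_of(a_cols),
    # divide each varA column by the varB column with the same time suffix
    .fns = ~ .x / get(sub("varA", "varB", cur_column())),
    # name each new column varC.t# directly, so no rename step is needed
    .names = "{sub('varA', 'varC', .col)}"
  ))

print(result)
```

Because a_cols is derived from the column names, the same code should cover t3, t4 and so on without changes, provided every varA.t# column has a matching varB.t# column.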