I am trying to add columns using rowSums
because it is a fast function, using cur_column because the select columns depend on the initial columns a_value
b_value
. This is a reprex, so it needs to use rowSums
cur_column
or something similar because it will be used on thousands of rows and ~ 16 columns.
input:
library(tidyverse)
df <- tibble(
a_value = c(1, 2, 3),
b_value = c(4, 5, 6),
a_param1 = c(7, 8, 9),
b_param1 = c(10, 11, 12),
a_param2 = c(13, 14, 15),
b_param2 = c(16, 17, 18),
a_param3 = c(19, 20, 21),
b_param3 = c(22, 23, 24),
a_param4 = c(25, 26, 27),
b_param4 = c(28, 29, 30)
)
result of input.
# A tibble: 3 x 10
a_value b_value a_param1 b_param1 a_param2 b_param2 a_param3 b_param3 a_param4 b_param4
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 4 7 10 13 16 19 22 25 28
2 2 5 8 11 14 17 20 23 26 29
3 3 6 9 12 15 18 21 24 27 30
desired output. The a|b
value_value_exc_parami
i+1
is the sum of a|b
_parami
and a|b
_parami+1
# A tibble: 3 x 14
a_value b_value a_param1 b_param1 a_param2 b_param2 a_param3 b_param3 a_param4 b_param4 a_value_exc_param_12 b_value_exc_param_12 a_value_exc_param_34 b_value_exc_param_34
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 4 7 10 13 16 19 22 25 28 20 26 44 50
2 2 5 8 11 14 17 20 23 26 29 22 28 46 52
3 3 6 9 12 15 18 21 24 27 30 24 30 48 54
What I tried so far, it does not work:
df %>%
mutate(
across(
contains("_value"),
list(
param_12 = rowSums(~ select(.,
get(str_replace(cur_column(), "_value", "_param1")),
get(str_replace(cur_column(), "_value", "_param2"))
)
),
param_34 = rowSums(~ select(.,
get(str_replace(cur_column(), "_value", "_param3")),
get(str_replace(cur_column(), "_value", "_param4"))
)
),
.names = "{.col}_exc_{.fn}"
)
)
)
There might be another easy and efficient approach I'm not aware of, feel free to provide, thanks.
You can use the following :
library(dplyr)
library(stringr)
df %>%
transmute(across(contains("_value"),
list(param_12 = ~rowSums(select(cur_data(),
str_replace(cur_column(), "_value", "_param1"),
str_replace(cur_column(), "_value", "_param2"))),
param_34 = ~rowSums(select(cur_data(),
str_replace(cur_column(), "_value", "_param3"),
str_replace(cur_column(), "_value", "_param4")))),
.names = "{.col}_exc_{.fn}"))
# a_value_exc_param_12 a_value_exc_param_34 b_value_exc_param_12 b_value_exc_param_34
# <dbl> <dbl> <dbl> <dbl>
#1 20 44 26 50
#2 22 46 28 52
#3 24 48 30 54
I have used transmute
instead of mutate
here to show the output. In your real case, you can of course use mutate
to append these new columns.
We can write a function to avoid repetition of code.
apply_sum <- function(data, col, val, replace) {
rowSums(select(data, str_replace(col, val, replace)))
}
df %>%
transmute(across(contains("_value"),
list(
param_12 = ~apply_sum(cur_data(), cur_column(), 'value', c('param1', 'param2')),
param_34 = ~apply_sum(cur_data(), cur_column(), 'value', c('param3', 'param4'))),
.names = "{.col}_exc_{.fn}"))