R: How to iterate a sum across multiple columns?

I am trying to iterate/loop a sum across multiple non-consecutive columns. My objective is to compute the subscale score of multiple questionnaires measured repeatedly across time.

Dataset for one questionnaire of x items and n time-points:

df <- tibble(
  ID = 1:5,
  itemA_1 = sample(100, 5, TRUE),
  itemB_1 = sample(100, 5, TRUE),
  itemC_1 = sample(100, 5, TRUE),
  itemD_1 = sample(100, 5, TRUE),
  itemx_1 = sample(100, 5, TRUE),
  itemA_3 = sample(100, 5, TRUE),
  itemB_3 = sample(100, 5, TRUE),
  itemC_3 = sample(100, 5, TRUE),
  itemD_3 = sample(100, 5, TRUE),
  itemx_3 = sample(100, 5, TRUE),
  itemA_n = sample(100, 5, TRUE),
  itemB_n = sample(100, 5, TRUE),
  itemC_n = sample(100, 5, TRUE),
  itemD_n = sample(100, 5, TRUE),
  itemx_n = sample(100, 5, TRUE),
)

The sum for one specific time point works just fine:

df %>% mutate(total_1 = sum(c(itemA_1, itemC_1, itemD_1)))

This loop does not work:

for (i in c(1, 3, n)) {
    df %>% mutate(total_i = sum(c(itemA_i, itemC_i, itemD_i)))
    }

What am I doing wrong?

Solution

One option is to reshape to 'long' format with pivot_longer, so that each time point becomes its own column, then sum by ID and join the totals back to the original data

library(dplyr)
library(tidyr)
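df1 <- df %>%
  # split names at "_": the prefix goes to `item`, the suffix becomes a value column
  pivot_longer(cols = -ID, names_to = c("item", ".value"), names_sep = "_") %>%
  # keep only the items that belong to the subscale
  filter(item %in% c("itemA", "itemC", "itemD")) %>%
  group_by(ID) %>%
  # sum each time-point column per ID and name the result total_<suffix>
  summarise(across(where(is.numeric), ~ sum(.x, na.rm = TRUE),
                   .names = "total_{.col}")) %>%
  left_join(df, ., by = "ID")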

-output

> df1
# A tibble: 5 × 19
     ID itemA_1 itemB_1 itemC_1 itemD_1 itemx_1 itemA_3 itemB_3 itemC_3 itemD_3 itemx_3 itemA_n itemB_n itemC_n itemD_n itemx_n total_1
  <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>   <int>
1     1      69      27      56      44      54      53      66      28      67      19      65      38      12      45      33     250
2     2      31      65       7      34      84      19      64      70      27      23      98      65      94      71     100     221
3     3      58      34      68      18      69     100      24      47      54      60      47      48      81      61      22     247
4     4      95      16      85      34       9      28      73      57      79      60      57      31      16      24      84     239
5     5      19      66      43      25      35      31      39      17      15      84      10      23     100       6      74     188
# … with 2 more variables: total_3 <int>, total_n <int>
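
To see what the reshape step does on its own, here is a minimal sketch on a hypothetical toy tibble (the name `toy` and its values are made up for illustration). With ".value" in names_to, the part of each column name after the "_" becomes a column of its own, so every time point ends up as one column that can be summed per ID.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two items measured at two time points
toy <- tibble(
  ID = 1:2,
  itemA_1 = c(1, 2),
  itemB_1 = c(10, 20),
  itemA_3 = c(3, 4),
  itemB_3 = c(30, 40)
)

# Split each name at "_": the first part is stored in `item`,
# the second part (".value") names the column that holds the values
long <- toy %>%
  pivot_longer(cols = -ID, names_to = c("item", ".value"), names_sep = "_")

# One row per ID and item, one column per time point
long
```

The time-point columns are named `1` and `3` here, which is why where(is.numeric) in the summarise step picks them up once ID is used as the grouping variable.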

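The original loop fails because dplyr treats total_i and itemA_i as literal column names, so i is never substituted into them. In addition, n has to be quoted ('n') because there is no object called n, and the result of mutate has to be assigned back to df. If we want to use the for loop, then paste the column names with i, evaluate (!!) while assigning (:=). rowSums is used instead of sum so that each row gets its own total rather than one grand total for the whole column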

library(stringr)
for (i in c(1, 3, 'n')) {
  df <- df %>%
    mutate(!! str_c("total_", i) :=
      rowSums(across(all_of(str_c(c("itemA_", "itemC_", "itemD_"), i)))))
}

But note that this is not dynamic, as we have to list the time-point suffixes (1, 3, ..., n) manually in the loop
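
One way around the manual list is to derive the suffixes from the column names. The following is only a sketch under assumptions: every non-ID column is named <item>_<suffix>, every subscale item exists at every time point, and the toy data and the helper names `items`, `suffixes` and `cols` are hypothetical.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data with two time points, "1" and "n"
df <- tibble(
  ID = 1:3,
  itemA_1 = c(1, 2, 3), itemC_1 = c(4, 5, 6),
  itemD_1 = c(7, 8, 9), itemx_1 = c(0, 0, 1),
  itemA_n = c(1, 1, 1), itemC_n = c(2, 2, 2),
  itemD_n = c(3, 3, 3), itemx_n = c(9, 9, 9)
)

# Items that make up the subscale
items <- c("itemA", "itemC", "itemD")

# Drop everything up to the last "_" to get the distinct suffixes
suffixes <- unique(str_remove(setdiff(names(df), "ID"), "^.*_"))

for (i in suffixes) {
  # Column names of the subscale items at this time point
  cols <- str_c(items, "_", i)
  df <- df %>%
    mutate(!!str_c("total_", i) := rowSums(across(all_of(cols))))
}

df %>% select(ID, starts_with("total_"))
```

The suffixes are collected before the loop starts, so the total_ columns added inside the loop do not feed back into the list.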

-checking that the for loop and the reshaping approach give the same totals

> all.equal(df1$total_1, df$total_1)
[1] TRUE
> all.equal(df1$total_3, df$total_3)
[1] TRUE
> all.equal(df1$total_n, df$total_n)
[1] TRUE