
How can I reduce this code with the DRY principle?


I am practicing the DRY principle in my R code, and I have reached a point where I cannot reduce the number of lines any further. The code is very repetitive, and I would appreciate your help.

Here is a reproducible example:

library(tidyverse)
library(magrittr)  # provides the %<>% assignment pipe, which library(tidyverse) does not attach
set.seed(2023)

# first, I generate the data
data <- as.data.frame(cbind(
  replicate(10, sample(0:1, 7, replace = TRUE)),
  replicate(10, sample(30:100, 7, replace = TRUE))
))

names(data) <- c(sprintf("var1_%02d", 1:10), sprintf("var2_%02d", 1:10))

data
#   var1_01 var1_02 var1_03 var1_04 var1_05 var1_06 var1_07 var1_08 var1_09 var1_10 var2_01 var2_02 var2_03 var2_04 var2_05 var2_06 var2_07 var2_08 var2_09 var2_10
# 1       0       1       0       0       0       0       0       0       0       0      61      72      74      58      85      93      85      46      99      55
# 2       1       1       0       1       0       0       0       1       1       0      66      56      91      72      77      53      61      34      57      43
# 3       0       0       1       1       1       1       0       1       1       1      71      89      49      99      38      84      53      41      95      64
# 4       0       0       0       0       1       0       1       1       1       1      50      91      83      61      81      41      71      83      96      81
# 5       1       0       1       1       1       1       1       1       0       1      41      61      79      67      96      98      97      60      36      90
# 6       0       0       0       1       1       1       1       1       1       1      60      93      39      86      53      82      69      39      67      54
# 7       1       0       0       0       1       0       0       1       1       0      57      96      82      47      95      41     100      53      98      45

This is the code I want to reduce:

data %<>%
  mutate(var3_01 = case_when(var1_01 == 1 ~ var2_01 + 0, TRUE ~ 0),
         var3_02 = case_when(var1_02 == 1 ~ var2_02 + 0, TRUE ~ 0),
         var3_03 = case_when(var1_03 == 1 ~ var2_03 + 0, TRUE ~ 0),
         var3_04 = case_when(var1_04 == 1 ~ var2_04 + 0, TRUE ~ 0),
         var3_05 = case_when(var1_05 == 1 ~ var2_05 + 0, TRUE ~ 0),
         var3_06 = case_when(var1_06 == 1 ~ var2_06 + 0, TRUE ~ 0),
         var3_07 = case_when(var1_07 == 1 ~ var2_07 + 0, TRUE ~ 0),
         var3_08 = case_when(var1_08 == 1 ~ var2_08 + 0, TRUE ~ 0),
         var3_09 = case_when(var1_09 == 1 ~ var2_09 + 0, TRUE ~ 0),
         var3_10 = case_when(var1_10 == 1 ~ var2_10 + 0, TRUE ~ 0))

The goal is that, for each row, var3_* takes the value of var2_* when var1_* == 1, and 0 otherwise. However, I have not been able to write this in a shorter form (tidyverse or base R, either is fine). I tried this:

numbers <- c(paste0("0", 1:5))

data %<>% 
  mutate(across(starts_with("var1_"), ~ifelse(isTRUE(.x==1), .x:=data[, 6:10], 0), .names="var3_{numbers}"))

But this code does not produce the same result as the long version. I appreciate any suggestions!

EDIT: Thank you all for your suggestions and for editing the reproducible example. My question is resolved, and I learned a lot from your answers. Best wishes to all!


Solution

  • Staying within tidyverse

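    You can use across() together with cur_column() and get(): for each var1_* column, get() looks up the matching var2_* column by name, so a single ifelse() replaces the ten repeated case_when() calls.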

    cols <- names(data)[1:10]  # the var1_* columns
    
    data |> 
      mutate(across(all_of(cols), \(x) {
        # look up the var2_* column that matches the current var1_* column
        ifelse(x == 1, get(sub("var1", "var2", cur_column())), 0)
      }, .names = "{sub('var1', 'var3', .col)}"))
    
      var1_01 var1_02 var1_03 var1_04 var1_05 var1_06 var1_07 var1_08 var1_09 var1_10 var2_01 var2_02 var2_03 var2_04
    1       0       0       1       1       1       0       0       1       1       1      31      74      42      60
    2       0       1       0       0       1       0       1       0       1       1      92      63      57      98
    3       1       1       0       1       0       0       0       1       1       0      53      89      64      42
    4       0       1       0       0       0       1       0       1       1       1      55      37      41      97
    5       0       0       0       0       1       1       0       0       0       1      47      87      56      60
    6       0       0       1       0       1       0       0       0       0       1      99      73      79      31
    7       1       0       0       1       0       0       0       1       1       0      61      44      52      90
      var2_05 var2_06 var2_07 var2_08 var2_09 var2_10 var3_01 var3_02 var3_03 var3_04 var3_05 var3_06 var3_07 var3_08
    1      60      55      57      67      97      40       0       0      42      60      60       0       0      67
    2      97      78      74      30      90      49       0      63       0       0      97       0      74       0
    3      77      43      52      84      43      78      53      89       0      42       0       0       0      84
    4      95      94      65      86      32      82       0      37       0       0       0      94       0      86
    5      47      65     100      70      91      40       0       0       0       0      47      65       0       0
    6      93      77      92      57      76      93       0       0      79       0      93       0       0       0
    7      46     100      74      35      38      56      61       0       0      90       0       0       0      35
      var3_09 var3_10
    1      97      40
    2      90      49
    3      43       0
    4      32      82
    5       0      40
    6       0      93
    7      38       0
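
  • A base R sketch

    Since the question allows a base R version, the same pairing of var1_* and var2_* columns could be sketched without dplyr. This is only one possible approach, not part of the answer above: the helper names suffix, flag_cols, val_cols and out_cols are made up for illustration, and the data is rebuilt from the question's own code so the block runs on its own.

```r
# Rebuild the example data from the question (base R only).
set.seed(2023)
data <- as.data.frame(cbind(
  replicate(10, sample(0:1, 7, replace = TRUE)),
  replicate(10, sample(30:100, 7, replace = TRUE))
))
names(data) <- c(sprintf("var1_%02d", 1:10), sprintf("var2_%02d", 1:10))

# Hypothetical helper vectors: pair the columns by their shared suffix.
suffix    <- sprintf("%02d", 1:10)
flag_cols <- paste0("var1_", suffix)  # 0/1 indicator columns
val_cols  <- paste0("var2_", suffix)  # value columns
out_cols  <- paste0("var3_", suffix)  # new columns to create

# Map() walks both column lists in parallel and returns one new column
# per pair; assigning to data[out_cols] names them var3_01 ... var3_10.
data[out_cols] <- Map(
  function(flag, value) ifelse(flag == 1, value, 0),
  data[flag_cols],
  data[val_cols]
)
```

    Because the flags here are strictly 0/1, multiplying each flag column by its value column would give the same numbers, but ifelse() states the intent more directly and still behaves sensibly if a flag ever takes another value.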