r dplyr mutate rowsum

Row sums of specific columns meeting a condition


I'm trying to count, for each row, how many of a specific set of columns meet a condition, as in the example below:

library(dplyr)
library(readr)
set.seed(123)
data=data.frame(id=1:4,
                v1=sample(c("a","b"),4,TRUE),
                v2=sample(c("a","b"),4,TRUE),
                v3=sample(c("a","b"),4,TRUE),
                v4=sample(c("a","b"),4,TRUE),
                v5=sample(c("a","b"),4,TRUE),
                v5=sample(c("a","b"),4,TRUE)
                )
data%>%
  rowwise()%>%
  mutate(across(v1:v4,~sum(.x=="a")))%>%
  mutate(n_a=sum(c(v1,v2,v3,v4)))
#> # A tibble: 4 × 8
#> # Rowwise: 
#>      id    v1    v2    v3    v4 v5    v5.1    n_a
#>   <int> <int> <int> <int> <int> <chr> <chr> <int>
#> 1     1     1     1     1     0 b     a         3
#> 2     2     1     0     1     1 a     b         3
#> 3     3     1     0     0     0 a     a         1
#> 4     4     0     0     0     1 a     a         1

Here n_a is the number of variables from v1 to v4 that have the value "a". Is there a better way to implement this?

  • Can it be done in one mutate line, without transforming the other variables?
  • Can I use sum with something like v1:v4?

Created on 2023-07-28 with reprex v2.0.2


Solution

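  • You may use rowSums with pick (available in dplyr 1.1.0 and later). pick(v1:v4) selects the columns, comparing with "a" gives a logical matrix, and rowSums counts the TRUE values in each row: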

    library(dplyr)
    
    data %>%
      mutate(n_a = rowSums(pick(v1:v4) == "a", na.rm = TRUE))
    
    #  id v1 v2 v3 v4 v5 v5.1 n_a
    #1  1  a  a  a  b  b    a   3
    #2  2  a  b  a  a  a    b   3
    #3  3  a  b  b  b  a    a   1
    #4  4  b  b  b  a  a    a   1
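
On the second bullet of the question (using sum with something like v1:v4): one possible sketch is to keep rowwise() and wrap the selection in c_across(), which collects the selected columns of the current row into a single vector. The data frame below hard-codes the values shown in the printed output instead of calling sample(), so the result does not depend on the random seed; the names res_rowwise and n_a_base are illustrative only. A base R equivalent is included for comparison.

```r
library(dplyr)

# Values copied from the printed output in the question (v1 to v4 only),
# so this example does not rely on set.seed()/sample().
data <- data.frame(
  id = 1:4,
  v1 = c("a", "a", "a", "b"),
  v2 = c("a", "b", "b", "b"),
  v3 = c("a", "a", "b", "b"),
  v4 = c("b", "a", "b", "a")
)

# Row-wise variant: c_across(v1:v4) returns the current row's values of
# v1..v4 as one vector, so sum() counts the matches in that row.
res_rowwise <- data %>%
  rowwise() %>%
  mutate(n_a = sum(c_across(v1:v4) == "a", na.rm = TRUE)) %>%
  ungroup()  # drop the rowwise grouping again

# Base R variant: compare the selected columns and count TRUE per row.
n_a_base <- rowSums(data[paste0("v", 1:4)] == "a", na.rm = TRUE)

print(res_rowwise)
print(n_a_base)
```

The rowwise version leaves v1 to v4 untouched, as the rowSums answer does, but it evaluates sum() once per row, so the vectorised rowSums form is likely to be faster on large data.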