Search code examples
rdplyrdata-management

Sum rows by group with condition in R


I have a dataset in R like this one:

enter image description here

and I want to keep the same dataset with adding a column that gives the sum rows by ID when A=B=1.

This is the required dataset:

enter image description here

I tried the following R code but it doesn't give the result I want:

library(dplyr)

data1<-data%>% group_by(ID) %>% 
  mutate(result=case_when(A==1 & B==1 ~ sum(A),TRUE ~ 0)) %>% ungroup()

Solution

  • After grouping by 'ID', multiply the 'A' with 'B' (0 values in B returns 0 in A) and then get the sum

    library(dplyr)
    data  %>%
         group_by(ID) %>%
         mutate(result = sum(A*B)) %>%
         ungroup
    

    -output

    # A tibble: 10 × 4
          ID     A     B result
       <dbl> <dbl> <dbl>  <dbl>
     1     1     1     0      3
     2     1     1     1      3
     3     1     0     1      3
     4     1     0     0      3
     5     1     1     1      3
     6     1     1     1      3
     7     2     1     0      2
     8     2     1     1      2
     9     2     1     1      2
    10     2     0     0      2
    

    data

    data <- structure(list(ID = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2), A = c(1, 
    1, 0, 0, 1, 1, 1, 1, 1, 0), B = c(0, 1, 1, 0, 1, 1, 0, 1, 1, 
    0)), class = "data.frame", row.names = c(NA, -10L))