Search code examples
rdplyrgrouping

Conditional rowMeans with dplyr


I want to apply a conditional rowMeans to each group of IDs using dplyr. The reproducible table follows:

dat <- as.Date("2021/08/04")
len <- 4
seq(dat, by = "day", length.out = len)

input <- data.frame(
  date = c(seq(dat, by = "day", length.out = len) , seq(dat, by = "day", length.out = len)),
  user_id = c(rep("aa",4),rep("bb",4)),
  var1 = c(1:4),
  var2 = c(4,7,23,9,0,0,0,0),
  var3 = c(0,0,0,0,4,8,2,7) 
)

The logic is: For each ID, first select the columns which are numeric and their sum is =!0, and then compute the rowMean of the selected columns for that ID.

The desired output table follows:

 output = data.frame(
  date = c(seq(dat, by = "day", length.out = len) , seq(dat, by = "day", length.out = len)),
  user_id = c(rep("aa",4),rep("bb",4)),
  var1 = c(1:4),
  var2 = c(4,7,23,9,0,0,0,0),
  var3 = c(0,0,0,0,4,8,2,7),
  rowAverage = as.numeric(c(rowMeans(input[1:4,3:4]), rowMeans(input[5:8,c(3,5)])))
)

The lines that I came-up with follow, but I get some errors:

output = input %>%
    dplyr::group_by(user_id) %>%
    dplyr::mutate(rowAverage = rowMeans(select_if(function(x) {(is.numeric(x)) && (sum(x)=!0)})))

Could you please help me correct this? Thank you.


Solution

  • You’re very close. You just need to supply the data to select_if(), too. Use cur_data() for that.

    input %>%
      group_by(user_id) %>%
      mutate(rowAverage = rowMeans(
        select_if(cur_data(), \(x) is.numeric(x) && sum(x) != 0)
      ))
    #> # A tibble: 8 × 6
    #> # Groups:   user_id [2]
    #>   date       user_id  var1  var2  var3 rowAverage
    #>   <date>     <chr>   <int> <dbl> <dbl>      <dbl>
    #> 1 2021-08-04 aa          1     4     0        2.5
    #> 2 2021-08-05 aa          2     7     0        4.5
    #> 3 2021-08-06 aa          3    23     0       13  
    #> 4 2021-08-07 aa          4     9     0        6.5
    #> 5 2021-08-04 bb          1     0     4        2.5
    #> 6 2021-08-05 bb          2     0     8        5  
    #> 7 2021-08-06 bb          3     0     2        2.5
    #> 8 2021-08-07 bb          4     0     7        5.5