r dplyr

Filter rows whose row sum is larger than a value


df <- data.frame(x = c(1,2,3,4), y = c(1,2,3,4), z = c("A","B","A","B"))

I'm trying to use dplyr's filter function to keep only the rows where the sum of x and y is greater than 4. I tried:

df %>% filter_at(vars(c(x,y)), any_vars(rowSums(.) > 4))

Error in `filter()`:
ℹ In argument: `rowSums(x) > 4 | rowSums(y) > 4`.
Caused by error in `rowSums()`:
! 'x' must be an array of at least two dimensions
Run `rlang::last_trace()` to see where the error occurred.

Desired output:

  x y z
  3 3 A
  4 4 B

Solution

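  • Use pick() inside rowSums(). filter_at() applies the predicate to each selected column separately, so rowSums() receives a plain vector instead of a data frame, which is what causes the error. pick(x:y) returns the selected columns as a data frame (it requires dplyr 1.1.0 or later):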

    library(dplyr)
    
    df %>% 
      filter(rowSums(pick(x:y)) > 4)
    #   x y z
    # 1 3 3 A
    # 2 4 4 B
    

    (Also note that scoped verbs like filter_at() have been superseded by across() and pick().)
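
If pick() is not available, or only a couple of columns are involved, the same filter can probably be written without it. The sketch below reuses the df from the question; the result names res_plus and res_base are only illustrative, and the base R version is offered as an alternative, not as part of the original answer.

```r
library(dplyr)

df <- data.frame(x = c(1,2,3,4), y = c(1,2,3,4), z = c("A","B","A","B"))

# A single column is a plain vector with no dimensions,
# which is why rowSums() rejected it in the filter_at() attempt.
dim(df$x)
# NULL

# With only two columns, adding them explicitly is the simplest option.
res_plus <- df %>%
  filter(x + y > 4)

# Base R equivalent: subset the columns as a data frame, then index rows.
res_base <- df[rowSums(df[, c("x", "y")]) > 4, ]
```

Both results should contain the rows (3, 3, A) and (4, 4, B). The base R version keeps the original row names (3 and 4), whereas filter() renumbers them from 1.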