
How to extract exactly three observations with biggest count


How can I extract exactly three observations that rank highest with respect to some variable, e.g. a count (the n column in the example data below)? I would like to avoid arranging the rows, so I thought I could use dplyr::min_rank.

ex <- structure(list(code = c("17.1", "6.2", "151.5", "78.1", "88.1", 
"95.1", "45.2", "252.2"), id = c(1, 2, 3, 4, 5, 6, 7, 8), n = c(6L, 
5L, 8L, 10L, 6L, 3L, 4L, 6L)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -8L))

ex %>% 
  filter(min_rank(desc(n)) <= 3)

But if there are ties, this can return more than three observations. For example, the command above returns five rows, because three rows share n = 6 and all of them receive rank 3:

# A tibble: 5 x 3
  code     id     n
  <chr> <dbl> <int>
1 17.1      1     6
2 151.5     3     8
3 78.1      4    10
4 88.1      5     6
5 252.2     8     6

How can I extract exactly three observations? (It does not matter which observation is returned in case of ties.)


Solution

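  • We can use row_number, which can take a column as its argument. Unlike min_rank, it gives every row a distinct rank and breaks ties by order of appearance, so the filter keeps exactly three rows.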

    ex %>% 
      filter(row_number(desc(n)) <= 3)
    # A tibble: 3 x 3
    #   code     id     n
    #   <chr> <dbl> <int>
    #1 17.1      1     6
    #2 151.5     3     8
    #3 78.1      4    10
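
To see why the two helpers behave differently, the sketch below puts both ranks side by side on the example data. It also shows one possible alternative that is not part of the original answer: slice_max with with_ties = FALSE, which assumes dplyr 1.0.0 or later. The names ranks, rank_min, rank_row and top3 are illustrative only.

```r
library(dplyr)

# Same example data as in the question, rebuilt with tibble()
ex <- tibble(
  code = c("17.1", "6.2", "151.5", "78.1", "88.1", "95.1", "45.2", "252.2"),
  id   = c(1, 2, 3, 4, 5, 6, 7, 8),
  n    = c(6L, 5L, 8L, 10L, 6L, 3L, 4L, 6L)
)

# min_rank() gives tied values the same rank;
# row_number() gives each row its own rank, ties broken by position
ranks <- ex %>%
  mutate(
    rank_min = min_rank(desc(n)),
    rank_row = row_number(desc(n))
  )

# Assumed alternative (dplyr >= 1.0.0): keep exactly three rows,
# dropping surplus tied rows instead of returning all of them
top3 <- ex %>%
  slice_max(order_by = n, n = 3, with_ties = FALSE)
```

In ranks, the three rows with n = 6 all have rank_min equal to 3, while their rank_row values are 3, 4 and 5, which is why only the first of them passes the <= 3 filter.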