How can I extract only the top three observations with respect to some variable, e.g. a count (the n
column in the example data below)? I would like to avoid arranging rows, so I thought I could use
dplyr::min_rank.
library(dplyr)

ex <- structure(list(
  code = c("17.1", "6.2", "151.5", "78.1", "88.1", "95.1", "45.2", "252.2"),
  id = c(1, 2, 3, 4, 5, 6, 7, 8),
  n = c(6L, 5L, 8L, 10L, 6L, 3L, 4L, 6L)
), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -8L))

# Keep rows whose rank by descending n is at most 3
ex %>%
  filter(min_rank(desc(n)) <= 3)
But if there are ties, this can return more than 3 observations, because min_rank gives tied values the same rank. Here the three rows with n = 6 all share rank 3, so the command above returns five rows:
# A tibble: 5 x 3
  code     id     n
  <chr> <dbl> <int>
1 17.1      1     6
2 151.5     3     8
3 78.1      4    10
4 88.1      5     6
5 252.2     8     6
How can I then extract exactly 3 observations? (It does not matter which observation is returned in case of ties.)
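We can use row_number, which can take a column as its argument. Unlike min_rank, it breaks ties by order of appearance, so every row gets a distinct rank and the filter keeps exactly three rows: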
ex %>%
  filter(row_number(desc(n)) <= 3)
# A tibble: 3 x 3
#  code     id     n
#  <chr> <dbl> <int>
#1 17.1      1     6
#2 151.5     3     8
#3 78.1      4    10
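To see why the two filters differ, it may help to put both rankings side by side. The sketch below rebuilds the example data with tibble() (re-exported by dplyr) and adds two illustrative columns, min_rank_n and row_number_n, which are names made up here and are not part of the original question. It also shows slice_max() with with_ties = FALSE as a possible alternative, assuming dplyr 1.0.0 or later is installed.

```r
library(dplyr)

# Same values as the example data in the question
ex <- tibble(
  code = c("17.1", "6.2", "151.5", "78.1", "88.1", "95.1", "45.2", "252.2"),
  id   = c(1, 2, 3, 4, 5, 6, 7, 8),
  n    = c(6L, 5L, 8L, 10L, 6L, 3L, 4L, 6L)
)

# Compare the two ranking functions on descending n
ranks <- ex %>%
  mutate(
    min_rank_n   = min_rank(desc(n)),   # tied values share the lowest rank
    row_number_n = row_number(desc(n))  # ties are broken by order of appearance
  )

# Filtering on each rank column
by_min_rank   <- ranks %>% filter(min_rank_n <= 3)    # ties can add extra rows
by_row_number <- ranks %>% filter(row_number_n <= 3)  # exactly three rows

# Possible alternative (assumes dplyr >= 1.0.0): drop ties explicitly
top3 <- ex %>% slice_max(n, n = 3, with_ties = FALSE)
```

Printing ranks shows the three rows with n = 6 all receiving min_rank_n = 3, while row_number_n assigns them 3, 4 and 5. The filter() approach keeps the rows in their original order; slice_max() is a shorter spelling if row order does not matter.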