I have a data frame, reach, with ordered Reachability values, and my desired output is a summary table of several properties grouped by Cluster. The full table contains more rows, but I think the first 10 are more than enough to explain what I want to achieve.
# A tibble: 500 x 3
   Order Reachability Cluster
   <int>        <dbl>   <dbl>
 1     1       NA           1
 2     2        1.54        1
 3     3        1.54        1
 4     4        0.860       1
 5     5        0.821       1
 6     6        0.821       1
 7     7        0.821       1
 8     8        0.821       1
 9     9        0.821       1
10    10        0.821       1
# ... with 490 more rows
I create my summary table with some position information about my reach table:
library(dplyr)

reach %>%
  group_by(Cluster) %>%
  summarise(first_value = first(na.omit(Reachability)),
            min_value   = min(na.omit(Reachability)),
            last_value  = last(na.omit(Reachability)),
            first_pos   = first(Order),
            min_pos     = Order[which.min(Reachability)],
            last_pos    = last(Order))
# A tibble: 1 x 7
  Cluster first_value min_value last_value first_pos min_pos last_pos
    <dbl>       <dbl>     <dbl>      <dbl>     <int>   <int>    <int>
1       1        1.54     0.821      0.821         1       5       10
What I'm having trouble with is a command inside summarise that counts how many times "min_value" occurs. In this case, for 0.821, the count should be 6. This is what I've tried, with no success:
... %>%
N_min = sum(Reachability == min(na.omit(Reachability))))
... %>%
N_min = count(min(na.omit(Reachability))))
Am I missing something? I really have no idea why my first option doesn't work. As I understand it, that sum, performed by group, should give me the number of TRUEs (or 1s) that meet my condition. Thanks!
reach <- structure(list(Order = 1:10, Reachability = c(NA, 1.53995982068778,
1.53995982068778, 0.860332791733694, 0.820585921380499, 0.820585921380499,
0.820585921380499, 0.820585921380499, 0.820585921380499, 0.820585921380499
), Cluster = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
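Your first option is logically correct, but Reachability contains an NA, so the comparison yields NA for that row and sum() returns NA unless you pass na.rm = TRUE. Floating point comparisons with == can also be inexact. (Ref: Why are these numbers not equal?)
If the values may differ slightly, try rounding the numbers before using sum:
N_min = sum(round(Reachability, 2) == round(min(Reachability, na.rm = TRUE), 2), na.rm = TRUE)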
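As a minimal sketch of an alternative to rounding, the base R snippet below counts the values that lie within a small tolerance of the minimum while ignoring NA. The helper name count_min and the tolerance of 1e-8 are assumptions made for illustration; they are not part of dplyr or of the original question. It also shows how the NA in the first row affects the plain sum().

```r
# Hypothetical helper (not a dplyr function): count how many values equal
# the minimum, ignoring NA and allowing a small numeric tolerance.
# The default tolerance is an arbitrary choice for this sketch.
count_min <- function(x, tol = 1e-8) {
  m <- min(x, na.rm = TRUE)
  sum(abs(x - m) <= tol, na.rm = TRUE)
}

# Reachability values taken from the dput() in the question
reachability <- c(NA, 1.53995982068778, 1.53995982068778, 0.860332791733694,
                  rep(0.820585921380499, 6))

# The NA in the first position propagates through `==` and `sum()`
sum(reachability == min(reachability, na.rm = TRUE))                # NA
sum(reachability == min(reachability, na.rm = TRUE), na.rm = TRUE)  # 6
count_min(reachability)                                             # 6
```

Inside the grouped summarise call, this would be used as N_min = count_min(Reachability). The tolerance should be chosen to suit the scale of the data.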