I have a data frame:
df <- data.frame(strain = 1:6, sample = c("a24", "a24", "a24", "a26", "a26", "a27"), region = c(rep("ny", 3), rep("detroit",3)))
I want to count the number of distinct samples per region and get something like:
| region | sample_count |
|---|---|
| ny | 1 |
| detroit | 2 |
That is, ny has only one distinct sample ("a24"), and detroit has two ("a26" and "a27").
You can do it this way, using n_distinct() from dplyr:
library(dplyr)
df |>
  group_by(region) |>                            # one group per region
  summarise(sample_count = n_distinct(sample))   # unique samples within each group
Output is:
# A tibble: 2 × 2
  region  sample_count
  <chr>          <int>
1 detroit            2
2 ny                 1
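
If you would rather not depend on dplyr, the same count can probably be done in base R. The following is only a sketch under the assumption that the data looks like the `df` from the question; the name `res` and the anonymous helper function are mine, for illustration, and not part of the original answer.

```r
# Same example data as in the question
df <- data.frame(
  strain = 1:6,
  sample = c("a24", "a24", "a24", "a26", "a26", "a27"),
  region = c(rep("ny", 3), rep("detroit", 3))
)

# For each region, count how many unique sample values occur
# (res is a hypothetical name used only in this sketch)
res <- aggregate(
  df["sample"],
  by = df["region"],
  FUN = function(x) length(unique(x))
)

# Rename the aggregated column to match the desired output
names(res)[names(res) == "sample"] <- "sample_count"
res
```

This should give 2 for detroit and 1 for ny, the same counts as the dplyr version. The result is a plain data.frame rather than a tibble.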