I am trying to filter my data and remove IDs that have fewer than 9 unique month observations. I would also like to create a list of IDs that includes the count.
I've tried using a few different options:
library(dplyr)
count <- bind %>% group_by(IDS) %>% filter(n(data.month)>= 9) %>% ungroup()
count2 <- subset(bind, with(bind, IDS %in% names(which(table(data.month)>=9))))
Neither of these worked.
This is what my data looks like:
data.month ID
01 2
02 2
03 2
04 2
05 2
05 2
06 2
06 2
07 2
07 2
07 2
07 2
07 2
08 2
09 2
10 2
11 2
12 2
01 5
01 5
02 5
01 7
01 7
01 7
01 4
02 4
03 4
04 4
05 4
05 4
06 4
06 4
07 4
07 4
07 4
07 4
07 4
08 4
09 4
10 4
11 4
12 4
In the end, I would like this:
IDs
2
4
I would also like this:
IDs Count
2 12
5 2
7 1
4 12
So far this code is the closest, but it still throws an error:
count <- bind %>%
group_by(IDs) %>%
filter(length(unique(bind$data.month >=9)))
Error in filter_impl(.data, quo) : Argument 2 filter condition does not evaluate to a logical vector
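The dplyr attempts in the question fail because filter() needs a logical condition for each group. n() takes no arguments, and in length(unique(bind$data.month >=9)) the comparison sits inside unique(), so filter() receives an integer length instead of TRUE/FALSE; bind$data.month also ignores the grouping. We can use n_distinct instead.
To keep only the IDs with at least 9 unique months (dropping those with fewer than 9):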
library(dplyr)
df %>%
  group_by(ID) %>%
  filter(n_distinct(data.month) >= 9) %>%
  pull(ID) %>%
  unique()
#[1] 2 4
Or
df %>%
  group_by(ID) %>%
  filter(n_distinct(data.month) >= 9) %>%
  distinct(ID)
# ID
# <int>
#1 2
#2 4
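If the goal is the filtered data itself rather than just the list of IDs, the same filter() call can be kept and the pull()/distinct() step dropped. The following is a minimal sketch, assuming the sample rows from the question are stored in a data frame named df with the months as character strings; the name filtered is only illustrative.

```r
library(dplyr)

# Rebuild the sample data from the question (assumption: months are strings)
df <- data.frame(
  data.month = c(
    "01", "02", "03", "04", "05", "05", "06", "06", "07",
    "07", "07", "07", "07", "08", "09", "10", "11", "12",
    "01", "01", "02",
    "01", "01", "01",
    "01", "02", "03", "04", "05", "05", "06", "06", "07",
    "07", "07", "07", "07", "08", "09", "10", "11", "12"
  ),
  ID = rep(c(2L, 5L, 7L, 4L), times = c(18L, 3L, 3L, 18L))
)

# Keep every row that belongs to an ID with at least 9 distinct months
filtered <- df %>%
  group_by(ID) %>%
  filter(n_distinct(data.month) >= 9) %>%
  ungroup()

unique(filtered$ID)  # IDs that survive the filter
nrow(filtered)       # number of rows kept
```

The ungroup() at the end avoids carrying the grouping into later steps.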
For the number of unique months per ID:
df %>%
  group_by(ID) %>%
  summarise(count = n_distinct(data.month))
# ID count
# <int> <int>
#1 2 12
#2 4 12
#3 5 2
#4 7 1
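Since both requested outputs depend on the same per-ID count, one option is to compute the count table once and reuse it for the filtering. This is only a sketch under the same assumption about the sample data; the names full_year, counts, keep_ids and filtered are hypothetical and not from the question.

```r
library(dplyr)

# Sample data from the question, built compactly (assumption: months are strings)
full_year <- sprintf("%02d", as.integer(c(1:4, 5, 5, 6, 6, rep(7, 5), 8:12)))
df <- data.frame(
  data.month = c(full_year, "01", "01", "02", "01", "01", "01", full_year),
  ID = rep(c(2L, 5L, 7L, 4L), times = c(18L, 3L, 3L, 18L))
)

# One row per ID with its number of distinct months
counts <- df %>%
  group_by(ID) %>%
  summarise(Count = n_distinct(data.month))

# IDs meeting the threshold, taken from the count table
keep_ids <- counts %>%
  filter(Count >= 9) %>%
  pull(ID)

# Original rows restricted to those IDs
filtered <- df %>%
  filter(ID %in% keep_ids)
```

Note that summarise() returns the groups sorted by ID, so the count table will not follow the order in which the IDs first appear in the data.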