ID | score | Time |
---|---|---|
1 | 1000000 | 1 |
2 | 1000000 | 1 |
2 | 1000000 | 2 |
3 | 1000000 | 1 |
3 | 1000000 | 2 |
4 | 1000000 | 1 |
5 | 1000000 | 1 |
5 | 1000000 | 2 |
How do I subset this database to a new data base including only ID's with at least 2 follow-ups? Follow-up is resembled as "Time", whereas people with only Time 1 had one follow-up and I want to exclude them from the database. Thanks.
We can either use a frequency based on approach to filter
the 'ID's having more than one observation after grouping by 'ID'
library(dplyr)
df1 %>% group_by(ID) %>%
filter(n() > 1) %>%
ungroup()
Or in base R
- also use subset
to subset the rows where it checks for 'ID's that have Time
value greater than 1
subset(df1, ID %in% ID[Time > 1])
df1 <- structure(list(ID = c(1L, 2L, 2L, 3L, 3L, 4L, 5L, 5L),
score = c(1000000L, 1000000L, 1000000L, 1000000L, 1000000L, 1000000L, 1000000L, 1000000L),
Time = c(1L, 1L, 2L, 1L, 2L, 1L, 1L, 2L)),
class = "data.frame", row.names = c(NA, -8L))