I want to delete the rows with duplicated ids.
Data:
id V1 V2
1 a 1
1 b 2
2 a 2
2 c 3
3 a 4
The problem is that some people took the test several times, which generated multiple scores in V2. I want to drop the duplicated ids and randomly retain one of the V2 scores for each id.
Desired output:
id V1 V2
1 a 1
2 a 2
3 a 4
I tried this:
neu <- unique(neu$userid)
but it didn't work.
Using dplyr:
library(dplyr)
set.seed(1)
df %>% sample_frac(., 1) %>% arrange(id) %>% distinct(id, .keep_all = TRUE)
Output:
id V1 V2
1 1 b 2
2 2 c 3
3 3 a 4
Data:
df <- structure(list(id = c(1L, 1L, 2L, 2L, 3L), V1 = structure(c(1L,
2L, 1L, 3L, 1L), .Label = c("a", "b", "c"), class = "factor"),
V2 = c(1L, 2L, 2L, 3L, 4L)), .Names = c("id", "V1", "V2"), class = "data.frame", row.names = c(NA,
-5L))
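If you would rather not depend on dplyr, the same idea (shuffle the rows, then keep the first row seen for each id) can be sketched in base R. This is only a minimal sketch under a few assumptions: the data frame is rebuilt here with V1 as a plain character column so the block runs on its own, and the names shuffled and one_per_id are made up for illustration. Which row survives for each id depends on the seed and on your R version.

```r
# Same toy data as in the question (V1 stored as character, an assumption
# made here for simplicity; the original uses a factor).
df <- data.frame(
  id = c(1L, 1L, 2L, 2L, 3L),
  V1 = c("a", "b", "a", "c", "a"),
  V2 = c(1L, 2L, 2L, 3L, 4L),
  stringsAsFactors = FALSE
)

set.seed(1)  # only for reproducibility; the rows kept depend on the seed

# 1. Shuffle the rows so the first occurrence of each id is a random one.
shuffled <- df[sample(nrow(df)), ]

# 2. duplicated() flags every repeat of an id after its first occurrence;
#    negating it keeps exactly one row per id.
one_per_id <- shuffled[!duplicated(shuffled$id), ]

# 3. Restore id order and reset the row names.
one_per_id <- one_per_id[order(one_per_id$id), ]
rownames(one_per_id) <- NULL

one_per_id
```

The original attempt, unique(neu$userid), returns only a vector of the distinct ids and then overwrites the data frame with it, which is why the other columns disappeared. Indexing the rows with !duplicated(...) keeps the whole row instead.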