I have the following sample:
Id = c(1, 1, 2, 2, 2, 1, 4, 3, 3, 3)
long = c("60.466681", "60.664116", "60.766690", "60.86879", "60.986569","60.466681", "60.664116", "60.766690", "60.86879", "60.986569" )
data = data.frame(Id, long)
I would like to remove the rows whose Id value occurs only once
in the data.frame. In this example, I would remove the row with Id == 4
and keep the others.
I tried:
data$duplicated <- duplicated(data$Id)
subset(data, data$duplicated == "FALSE")
but duplicated() also returns FALSE for the first occurrence of every Id
(i.e. the first rows with Id=1 or Id=2), so it cannot tell those rows
apart from the single Id == 4 row:
Id long duplicated
1 1 60.466681 FALSE
2 1 60.664116 TRUE
3 2 60.766690 FALSE
4 2 60.86879 TRUE
5 2 60.986569 TRUE
6 1 60.466681 TRUE
Is there an easy way to do this?
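For reference, the duplicated() attempt above can probably be rescued by scanning the column from both ends. This is only a minimal sketch in base R: it rebuilds the sample data so it runs on its own, and the names `repeated` and `kept` are made up for illustration.

```r
# Sample data from the question
Id <- c(1, 1, 2, 2, 2, 1, 4, 3, 3, 3)
long <- c("60.466681", "60.664116", "60.766690", "60.86879", "60.986569",
          "60.466681", "60.664116", "60.766690", "60.86879", "60.986569")
data <- data.frame(Id, long)

# duplicated() flags every occurrence except the first;
# with fromLast = TRUE it flags every occurrence except the last.
# A row is TRUE in at least one of the two exactly when its Id occurs more than once.
repeated <- duplicated(data$Id) | duplicated(data$Id, fromLast = TRUE)

# Keep only rows whose Id is repeated somewhere in the column
kept <- data[repeated, ]
print(kept)
```

Only the Id == 4 row is FALSE in both scans, so it is the only row dropped; the remaining rows keep their original order.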
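library(plyr)

# Split data by Id, drop every group that has a single row, then recombine.
data2 <- ddply(data, .(Id), function(x) {
  if (nrow(x) == 1) {
    return(NULL)  # ddply discards NULL results, so this Id is removed
  } else {
    return(x)     # keep all rows of an Id that occurs more than once
  }
})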
> data2
Id long
1 1 60.466681
2 1 60.664116
3 1 60.466681
4 2 60.766690
5 2 60.86879
6 2 60.986569
7 3 60.766690
8 3 60.86879
9 3 60.986569
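If you would rather avoid the plyr dependency, something along these lines should give the same set of rows in base R. It is a sketch rather than a drop-in replacement: it recreates the sample data so it is self-contained, and `group_size` and `data3` are illustrative names, not part of the original code.

```r
# Sample data from the question
Id <- c(1, 1, 2, 2, 2, 1, 4, 3, 3, 3)
long <- c("60.466681", "60.664116", "60.766690", "60.86879", "60.986569",
          "60.466681", "60.664116", "60.766690", "60.86879", "60.986569")
data <- data.frame(Id, long)

# ave() applies FUN within each Id group and returns a vector aligned with the rows,
# so each row gets the number of rows that share its Id.
group_size <- ave(data$Id, data$Id, FUN = length)

# Keep rows whose Id group has more than one member
data3 <- data[group_size > 1, ]
print(data3)
```

One difference to be aware of: ddply returns the rows grouped by Id with fresh row names, whereas this subset keeps the rows in their original order with their original row names.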