I have a data frame in R like the following:
V1 V2 V3
1 2 3
1 43 54
2 34 53
3 34 51
3 43 42
...
I want to delete all rows whose value of V1 has a frequency lower than 2. So in my example the row with V1 = 2 should be deleted, because the value "2" appears only once in the column ("1" and "3" appear twice each).
I tried to add an extra column holding the frequency of V1, so that I could keep only the rows where the frequency is > 1, but with the following I only get NAs in the extra column.
data$Frequency <- table(data$V1)[data$V1]
Thanks
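You can also consider using data.table. We first count the occurrences of each value in V1, then keep only the rows where that count is greater than 1. Finally, we remove the count column from the result, as we no longer need it. Note that := adds the column n to dat itself by reference, so it stays in dat even though it is dropped from dat2.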
library(data.table)
setDT(dat)
dat2 <- dat[, n := .N, by = V1][n > 1][, n := NULL]
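Or even quicker, thanks to RichardScriven. This collects the row numbers (.I) of every group with at least two rows and then uses them to subset dat. The index column is named idx here so that it does not clash with the grouping column V1:
dat2 <- dat[dat[, .(idx = .I[.N >= 2]), by = V1]$idx]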
> dat2
   V1 V2 V3
1:  1  2  3
2:  1 43 54
3:  3 34 51
4:  3 43 42
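As for the attempt in the question: the NAs most likely appear because table(data$V1)[data$V1] indexes the table by position when V1 is numeric, so any value of V1 larger than the number of distinct values points past the end of the table. Indexing by name should avoid that. Below is a minimal base R sketch, assuming the five rows shown in the question are the whole data set (the rows elided by "..." are left out) and that the data frame is called dat as in the answer; dat3 and the use of ave() are only an illustrative alternative, not part of the original answer.

```r
# Example data: only the five rows visible in the question (an assumption).
dat <- data.frame(
  V1 = c(1, 1, 2, 3, 3),
  V2 = c(2, 43, 34, 34, 43),
  V3 = c(3, 54, 53, 51, 42)
)

# Count how often each V1 value occurs; the table's names are the values as strings.
freq <- table(dat$V1)

# Look the counts up by name (character), not by position (numeric).
dat$Frequency <- as.vector(freq[as.character(dat$V1)])

# Keep rows whose V1 value occurs at least twice and drop the helper column.
dat2 <- dat[dat$Frequency >= 2, c("V1", "V2", "V3")]

# Alternative without a helper column: ave() returns the group size for each row.
dat3 <- dat[ave(dat$V1, dat$V1, FUN = length) >= 2, c("V1", "V2", "V3")]
```

Both dat2 and dat3 keep the original row names, so the row with V1 = 2 simply disappears from the numbering; rownames(dat2) <- NULL resets them if that matters.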