Dropping factors whose levels have fewer observations than a specific value in R


Suppose I have a data frame (df1) of factors like this:

factor1  factor2  factor3
-------  -------  -------
d        a         x
d        a         x
b        a         x
b        c         x
b        c         y
c        c         y
c        n         y
c        n         y
c        n         y

I want to drop every factor (column) from this data frame that has at least one level with fewer than 3 observations.

In this data frame, factor1 has 3 levels (d, b and c). However, level d has a frequency of only 2, so I want to drop factor1 from the data frame.

The resulting data frame should be:

factor2  factor3
-------  -------
a         x
a         x
a         x
c         x
c         y
c         y
n         y
n         y
n         y

How can I do this in R? I would be very grateful for any help. Thanks a lot.


Solution

  • You could try using sapply and table (sapply returns a plain vector, so the comparison with 3 gives an ordinary logical index):

    df1[, sapply(c(1,2,3), FUN = function(x) min(table(df1[,x]))) >= 3]
    

    and, a little more generically:

    df1[, sapply(seq_len(ncol(df1)), FUN = function(x) min(table(df1[,x]))) >= 3]
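
    For a reproducible check, the same idea can be sketched as a small function. This is only an illustration: the data frame is rebuilt from the table in the question, and drop_sparse_factors and its min_n argument are hypothetical names, not part of base R. It uses vapply so the result is guaranteed to be logical, and drop = FALSE so a data frame is returned even when only one column survives.

```r
# Rebuild the example data from the question.
df1 <- data.frame(
  factor1 = factor(c("d", "d", "b", "b", "b", "c", "c", "c", "c")),
  factor2 = factor(c("a", "a", "a", "c", "c", "c", "n", "n", "n")),
  factor3 = factor(c("x", "x", "x", "x", "y", "y", "y", "y", "y"))
)

# Hypothetical helper: keep only the columns whose rarest level
# occurs at least `min_n` times.
drop_sparse_factors <- function(df, min_n = 3) {
  keep <- vapply(df, function(col) min(table(col)) >= min_n, logical(1))
  df[, keep, drop = FALSE]
}

result <- drop_sparse_factors(df1)
names(result)
```

    With the data above, factor1 is removed because level d occurs only twice. Note that table() counts unused factor levels as 0, so a column with an unused level would also be dropped; calling droplevels() on the data frame first avoids that if it is not what you want.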