r, dataframe, subset

How to delete specific rows conditionally for many variables in a simple way?


Let's say I have this dataframe (but imagine it with hundreds of variables x, y, etc.).

df = data.frame ( x = c(1,2,3,4,5), y = c(1,2,3,4,5))

and I wish to delete the rows that contain either 1 or 5 in any variable.

I am familiar with the following approach:

df[!(df$x==1|df$x==5|df$y==1|df$y==5),]

But I am looking for a small function that can handle hundreds of variables at the same time.


Solution

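  • You could compare the whole data frame against the unwanted values at once, count the matches per row with rowSums(), and keep only the rows with zero matches: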

    df = data.frame ( x = c(1,2,3,4,5), y = c(1,2,3,4,5))
    # df==1|df==5 is a logical matrix; keep rows with no TRUE in any column
    df[rowSums(df==1|df==5)==0,]
    #>   x y
    #> 2 2 2
    #> 3 3 3
    #> 4 4 4
    

    Created on 2022-10-07 with reprex v2.0.2


    df = data.frame ( x = c(1,2,3,4,5), y = c(1,2,3,4,5))
    # Variant: df[-1] drops the first column, so only the remaining columns are checked
    df[rowSums(df[-1]==1|df[-1]==5)==0,]
    #>   x y
    #> 2 2 2
    #> 3 3 3
    #> 4 4 4
    

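
Since the question asks for a small function, the same idea could be wrapped up roughly as follows. This is only a sketch: the name `drop_rows_with` and its arguments `values` and `cols` are made up for illustration and are not part of the original answer. It uses `%in%` instead of `==`, so it works for any number of unwanted values and leaves rows with `NA` in place (with `==`, an `NA` cell makes `rowSums()` return `NA` for that row, which turns into an all-`NA` row when subsetting).

```r
# Hypothetical helper (not from the original answer): drop every row in which
# any of the selected columns contains one of `values`.
drop_rows_with <- function(data, values, cols = names(data)) {
  # One logical vector per selected column: TRUE where the cell is in `values`
  hits <- lapply(data[cols], function(col) col %in% values)
  # Combine the columns with OR, starting from "no hit" for every row
  bad <- Reduce(`|`, hits, rep(FALSE, nrow(data)))
  # Keep the rows without any hit; drop = FALSE keeps the result a data frame
  data[!bad, , drop = FALSE]
}

df <- data.frame(x = c(1, 2, 3, 4, 5), y = c(1, 2, 3, 4, 5))

# Check all columns
drop_rows_with(df, c(1, 5))
#>   x y
#> 2 2 2
#> 3 3 3
#> 4 4 4

# Check only some columns, e.g. everything except the first
drop_rows_with(df, c(1, 5), cols = names(df)[-1])
```

Because the columns are processed one at a time, this should also work when the data frame mixes numeric and character columns, although comparisons then follow the usual `%in%` coercion rules.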