Tags: r, dataframe, duplicates, delete-row, corresponding-records

Delete rows that exist in another data frame?


I have the two following data frames (example):

df1:

name    profile    type    strand
A       4.5        1       +
B       3.2        1       +
C       5.5        1       +
D       14.0       1       -
E       45.1       1       -
F       32.8       1       -
G       19.9       1       +

df2:

name
A
B
C
G

I would like to delete the rows in df1 whose name also appears in df2$name, to get the following:

Output:

name    profile    type    strand
D       14.0       1       -
E       45.1       1       -
F       32.8       1       -

If anyone could tell me which piece of code to use, it would be a great help. It seemed simple at first, but I have been getting it wrong since yesterday.


Solution

  • You need the %in% operator:

    df1[!(df1$name %in% df2$name), ]

    This should give you the output you want.

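    • df1$name %in% df2$name returns a logical vector with one element per row of df1, TRUE where that row's name appears in df2$name.
    • The ! operator negates that vector, so only the rows whose name is not in df2 are kept.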
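
As a minimal sketch, the answer can be run end to end on the data from the question. The data frames below are retyped from the tables above; the intermediate names in_df2 and result are only illustrative and are not part of the original answer.

```r
# Example data retyped from the question
df1 <- data.frame(
  name    = c("A", "B", "C", "D", "E", "F", "G"),
  profile = c(4.5, 3.2, 5.5, 14.0, 45.1, 32.8, 19.9),
  type    = 1,
  strand  = c("+", "+", "+", "-", "-", "-", "+"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(name = c("A", "B", "C", "G"), stringsAsFactors = FALSE)

# Logical vector, one element per row of df1: TRUE where the name is also in df2
in_df2 <- df1$name %in% df2$name

# Negate it and use it as a row index; the trailing comma keeps all columns
result <- df1[!in_df2, ]
print(result)
```

The remaining rows are D, E and F. They keep their original row names (4, 5, 6); if that matters, rownames(result) <- NULL resets them to 1, 2, 3.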