
How to remove a row from a dataframe if a value does not exist in another dataframe in R?


I have two data frames, df_description and df_users. The first has a column named users. The second has two columns, user_1 and user_2. Every value in both columns of df_users should match a value in the users column of df_description.

If a value in df_users does not exist in df_description, the row containing that value should be removed from df_users.

Here is an example:

#df_description
users
Adam
Micheal
George

And

#df_users
user_1   user_2
Adam     George
Adam     Micheal
George   Elizabeth #since Elizabeth does not exist in df_description, this row should be removed

The final df_users should look something like this:

#df_users
user_1   user_2
Adam     George
Adam     Micheal
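
For reproducibility, the example data above can be built roughly like this. This is a minimal sketch: `stringsAsFactors = FALSE` is an assumption added here so that the names stay plain character vectors in older R versions.

```r
# Lookup table of valid user names (values copied from the example above)
df_description <- data.frame(
  users = c("Adam", "Micheal", "George"),
  stringsAsFactors = FALSE
)

# Pairs of users; the last row contains a name that is missing from df_description
df_users <- data.frame(
  user_1 = c("Adam", "Adam", "George"),
  user_2 = c("George", "Micheal", "Elizabeth"),
  stringsAsFactors = FALSE
)
```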

Solution

  • In Base R, try:

    df_users[df_users$user_1 %in% df_description$users &
               df_users$user_2 %in% df_description$users, ]
    

    Output:

    #   user_1  user_2
    #1    Adam  George
    #2    Adam Micheal
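
If df_users might have more than two user columns, the same `%in%` check can be applied to every column and the results combined. The following is a sketch rather than part of the original answer: it assumes every column of df_users holds user names, and the names `keep` and `df_users_clean` are made up for illustration.

```r
# Example data, rebuilt here so the snippet runs on its own
df_description <- data.frame(
  users = c("Adam", "Micheal", "George"),
  stringsAsFactors = FALSE
)
df_users <- data.frame(
  user_1 = c("Adam", "Adam", "George"),
  user_2 = c("George", "Micheal", "Elizabeth"),
  stringsAsFactors = FALSE
)

# For each column, flag which entries are known users (one logical vector per column)
known_per_column <- lapply(df_users, function(col) col %in% df_description$users)

# A row is kept only if it is flagged TRUE in every column
keep <- Reduce(`&`, known_per_column)

# 'keep' and 'df_users_clean' are hypothetical names used for this sketch
df_users_clean <- df_users[keep, ]
```

With the example data, `df_users_clean` keeps the first two rows and drops the row containing Elizabeth. Using `Reduce` with `&` avoids writing one `%in%` condition per column by hand.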