I'd like to remove some rows from my data frame, which looks like this:
df <- data.frame(col1 = c("a", "a", "m", "m", "m", "m", "n", "q"),
                 col2 = c("a", "b", "m", "x", "y", "z", NA, "p"))
col1 col2
1 a a
2 a b
3 m m
4 m x
5 m y
6 m z
7 n <NA>
8 q p
I'm only focusing on a and m in col1 because those values also appear in col2. For those values, I would like to remove the rows where col1 and col2 don't match.
Note: the df above is just a reproducible example of my much larger dataset, so hard-coding individual values like 'a' or 'm' isn't an option.
My desired outcome:
col1 col2
1 a a
2 m m
3 n <NA>
4 q p
Any suggestions? Thanks a lot for your help!
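You can try this with dplyr:
library(dplyr)
# keep rows where both columns match, or where col1 never appears in col2
df %>%
  filter(col1 == col2 | !col1 %in% intersect(col1, col2))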
which gives
col1 col2
1 a a
2 m m
3 n <NA>
4 q p
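If you would rather avoid the dplyr dependency, the same condition can be sketched in base R. This is only an illustration on the example data: the names shared, keep and res are made up here, and it assumes both columns are character vectors (hence the explicit stringsAsFactors = FALSE). One thing to watch is that col1 == col2 returns NA wherever col2 is NA; filter() silently drops rows whose condition is NA, so the base R version has to do that explicitly.

```r
# Same example data as in the question, forced to character columns
df <- data.frame(col1 = c("a", "a", "m", "m", "m", "m", "n", "q"),
                 col2 = c("a", "b", "m", "x", "y", "z", NA, "p"),
                 stringsAsFactors = FALSE)

# Values of col1 that also occur somewhere in col2 (here: "a" and "m")
shared <- intersect(df$col1, df$col2)

# Keep a row if both columns match, or if col1 is not one of the shared values.
# NA | TRUE is TRUE, so row 7 (n, NA) is kept; NA | FALSE would stay NA.
keep <- df$col1 == df$col2 | !(df$col1 %in% shared)

# Treat any remaining NA as "drop", mirroring what filter() does
res <- df[!is.na(keep) & keep, , drop = FALSE]
rownames(res) <- NULL  # renumber rows from 1, as in the desired output
res
```

On the example data this should leave the same four rows as the filter() call. A row whose col1 is a shared value but whose col2 is NA would be dropped by both versions.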