Remove rows with conditions using dplyr

I'd like to remove rows from my data frame, which looks like this:

df <- data.frame(col1 = c("a", "a", "m", "m", "m", "m", "n", "q"),
                 col2 = c("a", "b", "m", "x", "y", "z", NA, "p"))
  col1 col2
1    a    a
2    a    b
3    m    m
4    m    x
5    m    y
6    m    z
7    n <NA>
8    q    p

I'm only focusing on a and m in col1 because those values also appear in col2. For those values, I would like to remove the rows where col1 and col2 don't match. Rows whose col1 value never appears in col2 (n and q here) should be kept. Note: the df above is just a reproducible example of my huge dataset, so specifying individual values like 'a' or 'm' wouldn't be suitable.

My desired outcome

  col1 col2
1    a    a
2    m    m
3    n <NA>
4    q    p

Any suggestions? Thanks a lot for your help!

Solution

You can try this

library(dplyr)

# keep rows where the columns match, or where col1 never appears in col2
df %>%
    filter(col1 == col2 | !col1 %in% intersect(col1, col2))

which gives

  col1 col2
1    a    a
2    m    m
3    n <NA>
4    q    p
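To see why the filter works, it may help to spell the condition out step by step. The following is a minimal base R sketch of the same logic, not part of the original answer. The names `shared`, `keep`, and `result` are made up for illustration, and `stringsAsFactors = FALSE` is added on the assumption that the columns should be compared as character vectors (the default since R 4.0).

```r
df <- data.frame(col1 = c("a", "a", "m", "m", "m", "m", "n", "q"),
                 col2 = c("a", "b", "m", "x", "y", "z", NA, "p"),
                 stringsAsFactors = FALSE)

# Values of col1 that also occur somewhere in col2 ("a" and "m" here)
shared <- intersect(df$col1, df$col2)

# Keep a row if both columns match, or if col1 is not one of the shared values.
# For the n/NA row the comparison is NA, but NA | TRUE evaluates to TRUE.
keep <- (df$col1 == df$col2) | !(df$col1 %in% shared)

# which() drops any remaining NA conditions, as filter() does
result <- df[which(keep), ]
rownames(result) <- NULL
result
```

Note that a row with a shared value in col1 and NA in col2 would be dropped by both versions, because its condition evaluates to NA rather than TRUE.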
