R dedupe records that are not exactly duplicates

I have a list of record that I need to dedup, these look like a combination of the same set of, but using the regular functions to deduplicate records does not work because the two columns are not duplicates. Below is a reproducible example.

df <- data.frame( A  =  c("2","2","2","43","43","43","331","391","481","490","501","501","501","502","502","502"),

          B =  c("43","501","502","2","501","502","491","496","490","481","2","43","502","2","43","501"))

Below is the desired output that I'm looking for.

df_Final <- data.frame( A  =  c("2","2","2","331","391","481"),

          B =  c("43","501","502","491","496","490"))

Solution

I guess the idea is that you want to find when the elements in column A first appear in column B

idx = match(df$A, df$B)

and keep the row if the element in A isn't in B (is.na(idx)) or the element in A occurs before it's first occurrence in B (seq_along(idx) < idx)

df[is.na(idx) | seq_along(idx) < idx,]

Maybe a more-or-less literal tidyverse approach to this would be to create and then drop a temporary column

library(tidyverse)
df %>% mutate(idx = match(A, B)) %>%
    filter(is.na(idx) | seq_along(idx) < idx) %>%
    select(-idx)