r dataframe conditional-statements multiple-columns na

How to turn specific values of a column into NAs based on the condition of a letter existing in another column?

I have this example df

df = data.frame(id = c('1E','1F','2E','2F','3F','4F'),
                statut = c('A','A','B','B','A','A') )

For each id that has the number repeated, I wish that the statut becomes NA when the letter of the repeated id is E

I am looking to have the following output:

  id statut
1 1E   <NA>
2 1F      A
3 2E   <NA>
4 2F      B
5 3F      A
6 4F      A

Solution

Here's a set of conditions that match your expected output:

idx <- substr(df$id, 1, 1) == dplyr::lead(substr(df$id, 1, 1)) & substr(df$id, 2, 2) == "E"
df$statut[idx] <- NA

  id statut
1 1E   <NA>
2 1F      A
3 2E   <NA>
4 2F      B
5 3F      A
6 4F      A

Or in a more dplyr friendly pipeline:

library(tidyr)
library(dplyr)
df %>%
  separate_wider_regex(id, c(number = "\\d+", letter = ".*")) %>% 
  group_by(number) %>% 
  mutate(statut = ifelse(n() > 1 & letter == "E", NA, statut))