I have this example df
df = data.frame(id = c('1E','1F','2E','2F','3F','4F'),
statut = c('A','A','B','B','A','A') )
For each id that has the number repeated, I wish that the statut becomes NA when the letter of the repeated id is E
I am looking to have the following output:
id statut
1 1E <NA>
2 1F A
3 2E <NA>
4 2F B
5 3F A
6 4F A
Here's a set of conditions that match your expected output:
idx <- substr(df$id, 1, 1) == dplyr::lead(substr(df$id, 1, 1)) & substr(df$id, 2, 2) == "E"
df$statut[idx] <- NA
id statut
1 1E <NA>
2 1F A
3 2E <NA>
4 2F B
5 3F A
6 4F A
Or in a more dplyr friendly pipeline:
library(tidyr)
library(dplyr)
df %>%
separate_wider_regex(id, c(number = "\\d+", letter = ".*")) %>%
group_by(number) %>%
mutate(statut = ifelse(n() > 1 & letter == "E", NA, statut))