The following data frame is grouped by the id variable. For each id on variables X, Y, and Z, I wish to replace "no" with "yes" on the first row if and only if the specific id has "yes" in row(s) other than the first row.
id <- c(1,1,1,2,2,3,3)
X <- c("yes", "no", "no", "no", "no", "no", "no")
Y <- c("no", "no", "yes", "no", "yes", "no", "no")
Z <- c("no", "yes", "no", "no", "no", "no", "no")
df <- data.frame(id, X, Y, Z)
The expected output is:
id X Y Z
1 yes yes yes
1 no no no
1 no no no
2 no yes no
2 no no no
3 no no no
3 no no no
I tried using the ifelse function but ran into difficulties because of the grouping. I would appreciate any help. Thank you!
Here is a dplyr solution using case_when:
We check each group of rows sharing the same id:
If any row within that group has yes, then the first row of the group is changed to yes.
For all subsequent rows of the group, any yes is flipped to no.
All other values remain unchanged.
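library(dplyr)

# Note: .by and .default require dplyr >= 1.1.0
df %>%
  mutate(
    across(X:Z, ~ case_when(
      # first row of the group: "yes" if any row in the group is "yes"
      row_number() == 1 & any(. == "yes") ~ "yes",
      # later rows of the group: flip "yes" to "no"
      row_number() > 1 & . == "yes" ~ "no",
      # everything else stays as it is
      .default = .
    )),
    .by = id
  )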
id X Y Z
1 1 yes yes yes
2 1 no no no
3 1 no no no
4 2 no yes no
5 2 no no no
6 3 no no no
7 3 no no no
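If you are on a dplyr version older than 1.1.0, the .by and .default arguments are not available. A minimal sketch of the same logic using group_by()/ungroup() and a TRUE ~ fallback is below. The name result is only illustrative, and stringsAsFactors = FALSE is set on the assumption that the columns should be character vectors rather than factors.

```r
library(dplyr)

# Same example data as in the question, kept as character columns
df <- data.frame(
  id = c(1, 1, 1, 2, 2, 3, 3),
  X = c("yes", "no", "no", "no", "no", "no", "no"),
  Y = c("no", "no", "yes", "no", "yes", "no", "no"),
  Z = c("no", "yes", "no", "no", "no", "no", "no"),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(id) %>%
  mutate(across(X:Z, ~ case_when(
    # first row of the group: "yes" if any row in the group is "yes"
    row_number() == 1 & any(.x == "yes") ~ "yes",
    # later rows of the group: flip "yes" to "no"
    row_number() > 1 & .x == "yes" ~ "no",
    # fallback: keep the original value
    TRUE ~ .x
  ))) %>%
  ungroup()

print(as.data.frame(result))
```

The grouped result is a tibble, so as.data.frame() is used here only to print it in the same style as the output above.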