This question arose while working on another question, "Replace list names if they exist".
I have this manipulated iris dataset with two vectors:
library(dplyr)
new_name <- c("new_setoas", "new_virginica")
to_select <- c("setosa", "virginica")
iris %>%
  group_by(Species) %>%
  slice(1:2) %>%
  mutate(Species = as.character(Species))
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <chr>
1 5.1 3.5 1.4 0.2 setosa
2 4.9 3 1.4 0.2 setosa
3 7 3.2 4.7 1.4 versicolor
4 6.4 3.2 4.5 1.5 versicolor
5 6.3 3.3 6 2.5 virginica
6 5.8 2.7 5.1 1.9 virginica
I would like to replace the values in Species that appear in one vector (to_select) with the corresponding values from another vector (new_name).
When I do:
new_name <- c("new_setoas", "new_virginica")
to_select <- c("setosa", "virginica")
iris %>%
  group_by(Species) %>%
  slice(1:2) %>%
  mutate(Species = as.character(Species)) %>%
  mutate(Species = ifelse(Species %in% to_select, new_name, Species))
# I get:
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <chr>
1 5.1 3.5 1.4 0.2 new_setoas
2 4.9 3 1.4 0.2 **new_virginica** # should be new_setoas
3 7 3.2 4.7 1.4 versicolor
4 6.4 3.2 4.5 1.5 versicolor
5 6.3 3.3 6 2.5 **new_setoas** # should be new_virginica
6 5.8 2.7 5.1 1.9 new_virginica
I know this is happening because of recycling, but I don't know how to avoid it.
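The recycling can be reproduced in isolation with base R alone. The sketch below uses a plain character vector, `species`, as a stand-in for the Species column (that name, `recycled`, and `result` are made up for illustration). It shows that `ifelse()` stretches `new_name` to the length of the test and then picks elements purely by position:

```r
new_name  <- c("new_setoas", "new_virginica")
to_select <- c("setosa", "virginica")

# Hypothetical stand-in for the Species column after slice(1:2)
species <- c("setosa", "setosa", "versicolor", "versicolor",
             "virginica", "virginica")

# ifelse() recycles `yes` to the length of `test`, equivalent to:
recycled <- rep(new_name, length.out = length(species))

# Elements are then taken by position wherever the test is TRUE,
# so rows 2 and 5 receive the wrong label
result <- ifelse(species %in% to_select, new_name, species)
print(recycled)
print(result)
```

Rows 1, 2, 5 and 6 pass the `%in%` test, so they receive elements 1, 2, 5 and 6 of the recycled vector, regardless of which species each row actually holds.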
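We may use recode with a named vector. recode looks up each value by name rather than by position, so nothing is recycled. Instead of grouping and then modifying the group column afterwards, the recoding can be done in the group_by step itself (new_name and to_select are as defined in the question):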
library(dplyr)
iris %>%
  group_by(Species = recode(as.character(Species),
                            !!!setNames(new_name, to_select))) %>%
  slice(1:2)
Output:
# A tibble: 6 × 5
# Groups: Species [3]
Sepal.Length Sepal.Width Petal.Length Petal.Width Species
<dbl> <dbl> <dbl> <dbl> <chr>
1 5.1 3.5 1.4 0.2 new_setoas
2 4.9 3 1.4 0.2 new_setoas
3 7 3.2 4.7 1.4 versicolor
4 6.4 3.2 4.5 1.5 versicolor
5 6.3 3.3 6 2.5 new_virginica
6 5.8 2.7 5.1 1.9 new_virginica
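To make the `!!!setNames(...)` step less opaque, here is a small sketch on a plain character vector. The names `species`, `lookup`, `spliced`, `explicit`, `idx` and `by_match` are made up for illustration. It shows that splicing the named vector is the same as writing the old = "new" pairs by hand, and adds a base R variant built on `match()` for cases where dplyr is not available:

```r
library(dplyr)

new_name  <- c("new_setoas", "new_virginica")
to_select <- c("setosa", "virginica")

# Hypothetical stand-in for the Species column
species <- c("setosa", "versicolor", "virginica")

# Named vector: names are the old values, elements are the new ones
lookup <- setNames(new_name, to_select)

# Splicing with !!! passes each name/value pair as a separate argument
spliced  <- recode(species, !!!lookup)
explicit <- recode(species, setosa = "new_setoas", virginica = "new_virginica")
print(identical(spliced, explicit))

# Base R variant: match() returns the position of each value in to_select
# (NA if absent), and that position indexes new_name
idx      <- match(species, to_select)
by_match <- ifelse(is.na(idx), species, new_name[idx])
print(by_match)
```

In both variants the replacement is chosen by looking up the value itself, so the row order of the data no longer matters.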