I'm trying to create a function using mutate and case_match from the dplyr package as part of my analytical workflow. However, to fully automate the process, I want to pass in an additional dataframe (as an argument to the function) that contains pairs of text strings: an old value and the new value that should replace it wherever the old value is found in the dataframe that contains the data.
Without the for loop, this works perfectly:
dftest <- data.frame(old = c("ones", "twos", "fours", "fives"),
                     new = c("Humanoid", "Hairy", "Hairy", "what"))
test1 <- data.frame(spp = c("ones", "twos", "threes"), log = c(5, 61, 36))
updnames <- function(df, col, names_df) {
  require(dplyr)
  if (ncol(names_df) > 2) {
    stop("More than 2 columns in names dataframe")
  }
  if (sum(duplicated(names_df[1])) > 0) {
    stop("Duplicate old species names")
  } else {
    names_df <- names_df %>% mutate_all(as.character)
    df <- df %>%
      mutate(updnames = case_match({{col}},
                                   names_df[1,1] ~ names_df[1,2],
                                   names_df[2,1] ~ names_df[2,2],
                                   names_df[3,1] ~ names_df[3,2],
                                   names_df[4,1] ~ names_df[4,2],
                                   .default = {{col}}))
  }
  return(df)
}
test2 <- updnames(test1, spp, dftest)
> test2 # Correct output
     spp log updnames
1   ones   5 Humanoid
2   twos  61    Hairy
3 threes  36   threes
Adding the for loop does not work, though. The new column is created as expected, but its values are simply copied from the original column:
updnames <- function(df, col, names_df) {
  require(dplyr)
  if (ncol(names_df) > 2) {
    stop("More than 2 columns in names dataframe")
  }
  if (sum(duplicated(names_df[1])) > 0) {
    stop("Duplicate old species names")
  } else {
    names_df <- names_df %>% mutate_all(as.character)
    for (i in 1:nrow(names_df)) {
      df <- df %>%
        mutate(updnames = case_match({{col}},
                                     names_df[i,1] ~ names_df[i,2],
                                     .default = {{col}}))
    }
  }
  return(df)
}
test2 <- updnames(test1, spp, dftest)
> test2 # Wrong output
     spp log updnames
1   ones   5     ones
2   twos  61     twos
3 threes  36   threes
I tried looking at various other posts on Stack Overflow and reading the relevant documentation, but I can't seem to figure it out.
If anyone has any ideas, or alternative solutions for what I'm trying to achieve, that would be greatly appreciated.
Use recode, together with deframe from the tibble package:
test1 %>%
  mutate(new_spp = recode(spp, !!!deframe(dftest)))
     spp log  new_spp
1   ones   5 Humanoid
2   twos  61    Hairy
3 threes  36   threes
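To see why this works, it may help to look at the intermediate pieces. deframe() turns a two-column data frame into a named vector (names from the first column, values from the second), and !!! splices that vector into recode() as a series of old = new arguments. A minimal sketch, reusing dftest from the question; the name lookup is only for illustration:

```r
library(dplyr)
library(tibble)

dftest <- data.frame(old = c("ones", "twos", "fours", "fives"),
                     new = c("Humanoid", "Hairy", "Hairy", "what"),
                     stringsAsFactors = FALSE)

# Named character vector: names are the old values, elements are the new ones
lookup <- deframe(dftest)
names(lookup)

# Equivalent to recode(x, ones = "Humanoid", twos = "Hairy", ...);
# character values without a match ("threes") are left unchanged
recoded <- recode(c("ones", "twos", "threes"), !!!lookup)
recoded
```

Because the pairs are spliced in all at once, there is no need to loop over the rows of the lookup table.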
As a function:
update_names <- function(df, col, new_names) {
  df %>% mutate('{{col}}_new' := recode({{col}}, !!!deframe(new_names)))
}
update_names(test1, spp, dftest)
     spp log  spp_new
1   ones   5 Humanoid
2   twos  61    Hairy
3 threes  36   threes
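As for why the loop in the question fails: every pass overwrites updnames with a fresh result computed from the original column, with .default = {{col}}, so only the last pair (fives to what) survives. Since "fives" does not occur in test1, the final column is just a copy of spp. If you would rather keep the loop and case_match, one possible fix is to fall back to the column built so far instead of the original. This is a sketch, assuming dplyr 1.1.0 or later; updnames_loop is a hypothetical name, and the input checks from the question are omitted for brevity:

```r
library(dplyr)

dftest <- data.frame(old = c("ones", "twos", "fours", "fives"),
                     new = c("Humanoid", "Hairy", "Hairy", "what"),
                     stringsAsFactors = FALSE)
test1 <- data.frame(spp = c("ones", "twos", "threes"), log = c(5, 61, 36),
                    stringsAsFactors = FALSE)

# Hypothetical variant of the question's updnames(), for illustration only
updnames_loop <- function(df, col, names_df) {
  names_df <- names_df %>% mutate(across(everything(), as.character))
  # Start from a copy of the original column
  df <- df %>% mutate(updnames = as.character({{ col }}))
  for (i in seq_len(nrow(names_df))) {
    df <- df %>%
      mutate(updnames = case_match({{ col }},
                                   names_df[i, 1] ~ names_df[i, 2],
                                   # Keep earlier replacements rather than
                                   # resetting to the original values
                                   .default = updnames))
  }
  df
}

test3 <- updnames_loop(test1, spp, dftest)
test3$updnames
```

The recode version above is still shorter, but this shows that the only thing wrong with the loop was the .default argument.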