Tags: r, for-loop, dplyr, mutate

Combining a for loop and mutate's case_match within a function


I'm trying to write a function that uses mutate and case_match from the dplyr package as part of my analytical workflow. To fully automate the process, I want to pass in an additional dataframe (as a function argument) containing pairs of text strings: wherever an old value is found in the data, it should be replaced with its new value.

Without the for loop, this works perfectly:

dftest <- data.frame(old = c("ones","twos","fours","fives"), new = c("Humanoid",
"Hairy","Hairy","what"))
test1 <- data.frame(spp= c("ones", "twos", "threes"), log = c(5,61,36))

updnames <- function(df, col, names_df) {
    require(dplyr)
    if(ncol(names_df)>2)
    {stop("More than 2 columns in names dataframe")}
    if(sum(duplicated(names_df[1]))>0)
    {stop("Duplicate old species names")}
    else
    {
        names_df <- names_df %>% mutate_all(as.character)
        df <- df %>%
                mutate(updnames = case_match({{col}},
                    names_df[1,1] ~ names_df[1,2],
                    names_df[2,1] ~ names_df[2,2],
                    names_df[3,1] ~ names_df[3,2],
                    names_df[4,1] ~ names_df[4,2],
                    .default = {{col}}))}
    return(df)
}

test2 <- updnames(test1, spp,dftest)


> test2 # Correct output
     spp log updnames
1   ones   5 Humanoid
2   twos  61    Hairy
3 threes  36   threes

Adding a for loop over the rows of the names dataframe does not work, though. The new column is created as expected, but its values are simply copied from the original column:

updnames <- function(df, col, names_df) {
  require(dplyr)
  if(ncol(names_df)>2)
  {stop("More than 2 columns in names dataframe")}
  if(sum(duplicated(names_df[1]))>0)
  {stop("Duplicate old species names")}
  else
  {
  names_df <- names_df %>% mutate_all(as.character)
  for(i in 1:nrow(names_df)){
    df <- df %>%
  mutate(updnames = case_match({{col}},
      names_df[i,1] ~ names_df[i,2],
      .default = {{col}}))}
  }
  return(df)
}

test2 <- updnames(test1, spp, dftest)

> test2 # Wrong output
     spp log updnames
1   ones   5     ones
2   twos  61     twos
3 threes  36   threes

I tried looking at various other posts on Stack Overflow and reading the relevant documentation, but I can't seem to figure it out.

If anyone has any ideas, or alternative solutions for what I'm trying to achieve, that would be greatly appreciated.


Solution

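  • Use recode() with a named vector of old-to-new pairs. deframe() from the tibble package builds that vector from the two-column lookup data frame, and !!! splices it into the call.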

    library(dplyr)
    library(tibble)  # provides deframe()
    test1 %>%
      mutate(new_spp = recode(spp, !!!deframe(dftest)))
    
         spp log  new_spp
    1   ones   5 Humanoid
    2   twos  61    Hairy
    3 threes  36   threes
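
    To make the deframe() and !!! step concrete, here is a minimal sketch run outside mutate(). It assumes R 4.0 or later (so the columns of dftest are character rather than factor) and that tibble is installed; the name lookup is only illustrative.

    ```r
    library(dplyr)
    library(tibble)

    dftest <- data.frame(
      old = c("ones", "twos", "fours", "fives"),
      new = c("Humanoid", "Hairy", "Hairy", "what")
    )

    # deframe() turns a two-column data frame into a named vector:
    # names come from the first column, values from the second.
    # A base R equivalent would be setNames(dftest$new, dftest$old).
    lookup <- deframe(dftest)

    # !!! splices the named vector into recode() as old = "new" arguments.
    # Values with no matching name are left unchanged.
    result <- recode(c("ones", "twos", "threes"), !!!lookup)
    print(result)
    ```

    Because the lookup is a single named vector, the function does not need to know how many rows the names data frame has.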
    

    Wrapped in a function:

    update_names <- function(df, col, new_names) {
      df %>% mutate("{{col}}_new" := recode({{col}}, !!!deframe(new_names)))
    }
    update_names(test1, spp, dftest)
         spp log  spp_new
    1   ones   5 Humanoid
    2   twos  61    Hairy
    3 threes  36   threes
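
For context on why the loop in the question returns the column unchanged: each pass recomputes updnames from the original column with .default = {{col}}, so it discards whatever the earlier passes replaced. Only the last row of the lookup ("fives" to "what") has any effect, and since "fives" never occurs in spp, every value falls through to the default. Below is a minimal sketch of a loop that keeps earlier replacements, assuming a character column and R 4.0 or later. updnames_loop is a hypothetical name, not part of the original answer.

```r
library(dplyr)

dftest <- data.frame(
  old = c("ones", "twos", "fours", "fives"),
  new = c("Humanoid", "Hairy", "Hairy", "what")
)
test1 <- data.frame(spp = c("ones", "twos", "threes"), log = c(5, 61, 36))

updnames_loop <- function(df, col, names_df) {
  names_df <- names_df %>% mutate(across(everything(), as.character))

  # Seed the new column once, before the loop starts.
  df <- df %>% mutate(updnames = as.character({{ col }}))

  for (i in seq_len(nrow(names_df))) {
    df <- df %>%
      mutate(updnames = case_match(
        as.character({{ col }}),           # always match on the original values
        names_df[i, 1] ~ names_df[i, 2],
        .default = updnames                # keep results from earlier passes
      ))
  }
  df
}

test2 <- updnames_loop(test1, spp, dftest)
print(test2)
```

Matching against the original column while defaulting to updnames also avoids chained replacements when a new value happens to equal another old value. The recode() version above is still the shorter option.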