Pass a column name to the dplyr filter function from within a function


I'm writing a function to add some extra rows to a data frame I am building.

I've read many questions and answers from previous posters, but none of the answers, tips, and tricks I found there work for me.

For this question I have the following test data frame:

tst <- data.frame("col 1" = c("a", "a", "c"), "keyword test" = c("What", "Why", "how"), check.names = FALSE)
> tst
  col 1 keyword test
1     a         What
2     a          Why
3     c          how

As you can see, the column names contain spaces, which I cannot remove because the next tool expects spaces in the column names (don't ask me why!).

Now I want, for example, to filter all rows starting with "how" and replace "how" with "no_idea". This happens inside a temporary data frame, so that later on I can add the row "c" "no_idea" to the original data frame.

The function I wrote for this looks like this:

add_keyword <- function(df, filterColumn, filterValue,replacement){
  library(plyr)
  library(dplyr)
  temp_df <- dplyr::filter_(df, filterColumn == filterValue)
  temp_df$`Target keyword` <- gsub(as.character(filterValue), as.character(replacement), temp_df$`Target keyword`)
  df_out <- rbind(df, temp_df)
  return(df_out)
}

tst2 <- add_keyword(tst, "keyword test", "how", "no_idea")

Of course, if I run the lines from the function body outside the function, they work perfectly.

The result I would like to have in tst2:

> tst2
  col 1 keyword test
1     a         What
2     a          Why
3     c          how
4     c      no_idea

Solution

  • We can do this with interp from lazyeval:

    library(dplyr)
    library(lazyeval)
    
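    add_keyword <- function(df, filterColumn, filterValue, replacement){
        # Build the filter formula, substituting the column name (as a symbol) and the value
        temp_df <- df %>%
            filter_(interp(~ var == fval, var = as.name(filterColumn), fval = filterValue))
    
        # Replace the matched value in the filtered copy, then append it to the original
        temp_df[[filterColumn]] <- gsub(filterValue, replacement = replacement, temp_df[[filterColumn]])
        rbind(df, temp_df)
    }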
    add_keyword(tst, "keyword test", "how", "no_idea")
    #   col 1 keyword test
    # 1     a         What
    # 2     a          Why
    # 3     c          how
    # 4     c      no_idea
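
    The underscored verbs such as filter_ were deprecated in dplyr 0.7.0, so on a recent dplyr the same idea can probably be written without lazyeval. The following is only a sketch under that assumption: it uses the .data pronoun to look the column up by its string name. The function name add_keyword_tidy is made up for illustration, and stringsAsFactors = FALSE and fixed = TRUE are additions that are not in the original code.

    ```r
    library(dplyr)

    # Test data from the question; stringsAsFactors = FALSE is an added assumption
    # so the keyword column is character on every R version.
    tst <- data.frame("col 1" = c("a", "a", "c"),
                      "keyword test" = c("What", "Why", "how"),
                      check.names = FALSE, stringsAsFactors = FALSE)

    # Hypothetical name; mirrors add_keyword above.
    add_keyword_tidy <- function(df, filterColumn, filterValue, replacement) {
        # .data[[filterColumn]] refers to the column whose name is stored in the string
        temp_df <- df %>%
            filter(.data[[filterColumn]] == filterValue)

        # fixed = TRUE treats filterValue as a literal string, not a regular expression
        temp_df[[filterColumn]] <- gsub(filterValue, replacement,
                                        temp_df[[filterColumn]], fixed = TRUE)
        rbind(df, temp_df)
    }

    tst2 <- add_keyword_tidy(tst, "keyword test", "how", "no_idea")
    ```

    Because the column is addressed by a string, the spaces in the column names need no special handling.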
    

    If we do not want to create additional rows, we can also try:

    add_keyword <- function(df, filterColumn, filterValue, replacement){
        df <- df %>%
            mutate_(
                .dots = setNames(
                    list(
                        interp(~ ifelse(startsWith(as.character(var), fval), rval, as.character(var)),
                               var = as.name(filterColumn), fval = filterValue, rval = replacement)),
                    filterColumn
                )
            )
        df
    }
    add_keyword(tst, "keyword test", "how", "no_idea")
    
    
    #   col 1 keyword test
    # 1     a         What
    # 2     a          Why
    # 3     c      no_idea
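
    For the in-place variant, a similar sketch for a recent dplyr (again an assumption, since mutate_ is deprecated there) combines the .data pronoun with the `!!name := value` form to assign to a column named by a string. The function name replace_keyword is hypothetical, and stringsAsFactors = FALSE is an addition to the original test data.

    ```r
    library(dplyr)

    # Test data from the question; stringsAsFactors = FALSE is an added assumption.
    tst <- data.frame("col 1" = c("a", "a", "c"),
                      "keyword test" = c("What", "Why", "how"),
                      check.names = FALSE, stringsAsFactors = FALSE)

    # Hypothetical name; replaces matching values without adding rows.
    replace_keyword <- function(df, filterColumn, filterValue, replacement) {
        df %>%
            # !!filterColumn := ... assigns to the column named by the string
            mutate(!!filterColumn := ifelse(
                startsWith(as.character(.data[[filterColumn]]), filterValue),
                replacement,
                as.character(.data[[filterColumn]])
            ))
    }

    res <- replace_keyword(tst, "keyword test", "how", "no_idea")
    ```

    Note that startsWith is case-sensitive, so "how" does not match a value such as "How".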