r, select, dplyr

Create a function that selects columns using a column name and a prefix in R


I want to create a function that selects two columns: one named by the function's input, and a second whose name is that input prefixed with "other_" (in the example, "Species" and "other_Species"). I want to do this without adding another input to the function.

library(tidyverse)
iris_df <- iris %>%
  mutate(other_Species = Species)

iris_fn <- function(df, col) {
  df %>%
    select(all_of(col, paste0("other_", col)))
}

iris_fn(df=iris_df, col="Species")

Solution

  • Convert "other_Species" to a symbol first. Note that all_of() takes a single argument, so each column gets its own all_of() call.
    Another option, as suggested in the comments, is to combine both column names into one character vector with c().

    What do you mean by all_of() only accepting one string at a time? It takes a character vector, so you can use select(all_of(c(col, paste0("other_", col)))) in the function. There is no need to convert to a symbol if you're passing a string.

    suppressPackageStartupMessages(
      library(tidyverse)
    )
    iris_df <- iris %>%
      mutate(other_Species = Species)
    
    iris_fn <- function(df, col) {
      other_col <- sym(paste0("other_", col))
      df %>%
        select(all_of(col), all_of(other_col))
    }
    
    iris_fn(df = iris_df, col = "Species") %>% head()
    #>   Species other_Species
    #> 1  setosa        setosa
    #> 2  setosa        setosa
    #> 3  setosa        setosa
    #> 4  setosa        setosa
    #> 5  setosa        setosa
    #> 6  setosa        setosa
    

    Created on 2024-01-25 with reprex v2.0.2
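
The c() approach mentioned in the comment can be sketched as follows. This is a minimal example, assuming the same iris_df setup as in the question; the function name select_with_other is hypothetical and used only for illustration.

```r
library(dplyr)

# Same setup as in the question: copy Species under an "other_" prefix.
iris_df <- iris %>%
  mutate(other_Species = Species)

# select_with_other() is a hypothetical name chosen for this sketch.
select_with_other <- function(df, col) {
  # Build one character vector holding both column names,
  # then pass it to all_of() in a single call.
  cols <- c(col, paste0("other_", col))
  df %>%
    select(all_of(cols))
}

result <- select_with_other(iris_df, "Species")
names(result)
```

Because all_of() is strict, this version raises an error if the "other_" column does not exist in the data frame. If missing columns should be skipped instead, any_of() is the lenient counterpart.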