
Apply a custom function to any number of columns using dplyr


I'm following this example on the dplyr site to write a function that applies to any number of user-supplied columns.

Here is a reprex:

library(tidyverse)
library(palmerpenguins)

trim_whitespace <- function(.data, ...) {
  .data |> mutate(across(..., \(x) str_remove_all(x, "\\s")))
}

# Add whitespace at the end of chr columns
penguins_with_whitespace <- 
  penguins |>
  mutate(across(c(species, island), \(x) paste0(x, " ")))

penguins_with_whitespace |> trim_whitespace(species, island)

This causes the error: object 'island' not found. Why is this function not working, and how can I fix it?


Solution

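  • across() takes its column selection as a single first argument, .cols. When ... is forwarded straight into across(), species is matched to .cols and island lands in the next argument, .fns, where it is evaluated as an ordinary R object rather than as a column of the penguins dataset. That is where the "object 'island' not found" error comes from. The fix is either to wrap the dots in c(...) so they form one selection, or to take a single argument and embrace it with {{ }}, like this: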

    library(tidyverse)
    library(palmerpenguins)
    
    ## what the call inside trim_whitespace() effectively becomes
    penguins |>
      mutate(across(species, island, \(x) str_remove_all(x, '\\s')))
    #> Error in `mutate()`:
    #> ℹ In argument: `across(species, island, function(x) str_remove_all(x,
    #>   "\\s"))`.
    #> Caused by error:
    #> ! object 'island' not found
    
    
    f1 <- function(.data, ... ) {
      .data |> mutate(across(c(...), \(x) str_remove_all(x, "\\s")))
    }
    
    
    f2  <- function(.data, cols ) {
      .data |> mutate(across({{cols}}, \(x) str_remove_all(x, "\\s")))
    }
    
    
    
    penguins |>
      f1(species, island) |>
      head(1)
    #> # A tibble: 1 × 8
    #>   species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
    #>   <chr>   <chr>              <dbl>         <dbl>             <int>       <int>
    #> 1 Adelie  Torgersen           39.1          18.7               181        3750
    #> # ℹ 2 more variables: sex <fct>, year <int>
    
    penguins |>
      f2(c(species, island)) |>
      head(1)
    #> # A tibble: 1 × 8
    #>   species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
    #>   <chr>   <chr>              <dbl>         <dbl>             <int>       <int>
    #> 1 Adelie  Torgersen           39.1          18.7               181        3750
    #> # ℹ 2 more variables: sex <fct>, year <int>
    

    Created on 2024-12-02 with reprex v2.1.1
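
To check the behaviour without relying on printed output, here is a minimal sketch on a small made-up tibble. The data frame `df` and the wrapper name `broken` are hypothetical and only stand in for `penguins_with_whitespace` and the original `trim_whitespace()`. It shows that forwarding bare `...` happens to work for one column, fails for two, and that the `c(...)` version handles both, including tidyselect helpers.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data with trailing whitespace in the character columns
df <- tibble(
  species = c("Adelie ", "Gentoo "),
  island  = c("Torgersen ", "Biscoe "),
  year    = c(2007L, 2008L)
)

# Original version from the question: dots forwarded straight into across()
broken <- function(.data, ...) {
  .data |> mutate(across(..., \(x) str_remove_all(x, "\\s")))
}

# Fixed version (same idea as f1): c(...) makes the dots one selection
trim_whitespace <- function(.data, ...) {
  .data |> mutate(across(c(...), \(x) str_remove_all(x, "\\s")))
}

# One column: the single name fills .cols, so even the broken version works
one_col <- df |> broken(species)

# Two columns: the second name spills into .fns and the call errors
two_cols <- tryCatch(df |> broken(species, island), error = function(e) e)
inherits(two_cols, "error")

# The fixed wrapper accepts any number of columns
trimmed <- df |> trim_whitespace(species, island)
trimmed$species
#> [1] "Adelie" "Gentoo"

# Selection helpers also pass through the dots
trimmed_chr <- df |> trim_whitespace(where(is.character))
```

The `c(...)` form keeps the calling convention from the question (`trim_whitespace(species, island)`), while the `{{ }}` form asks callers to build the selection themselves with `c()`. Which one fits better depends on whether the function needs the dots for anything else.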