I am trying to write a function that can be used within a dplyr pipeline. It should take an arbitrary number of columns as arguments, and replace certain substrings in only those columns. Below is a simple example of what I have so far.
library(tidyverse)
tib <- tibble(
  x = c("cats and dogs", "foxes and hounds"),
  y = c("whales and dolphins", "cats and foxes"),
  z = c("dogs and geese", "cats and mice")
)
filter_words <- function(.data, ...) {
  words_to_filter <- c("cat", "dog")
  .data %>% mutate(
    across(..., ~ gsub(
      paste0(words_to_filter, collapse = "|"),
      "#@!*", ., perl = TRUE
    ))
  )
}
filtered_tib <- tib %>%
  filter_words(x, y)
If this worked I would expect:
x                 y                    z
#@!*s and #@!*s   whales and dolphins  dogs and geese
foxes and hounds  #@!*s and foxes      cats and mice
But I get an error:
Error: Can't splice an object of type `closure` because it is not a vector
Run `rlang::last_error()` to see where the error occurred.
Called from: signal_abort(cnd)
I have tried numerous combinations of non-standard evaluation, as gleaned from the tidyverse docs and many questions on SO, and have seen almost as many different errors! Would anyone be able to help get this working? It does work if I replace the dots with everything(), but that does not fit my use case, which is to filter only certain columns.
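Inside your function, across(..., should instead be across(c(...), because wrapping the dots in c() combines the columns into a single selection for the first argument of across().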
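To see why the bare dots misbehave, here is a rough base-R sketch of how the arguments get matched. across_like() is a hypothetical stand-in that only mimics the leading arguments of across() (.cols, .fns, ...); it is not dplyr code, and toupper stands in for the gsub lambda. It assumes R 3.5 or later for ...length().

```r
# Hypothetical stand-in that mimics the leading arguments of dplyr::across()
# (.cols, .fns, ...). It never evaluates anything; it only reports which
# expression landed in which argument.
across_like <- function(.cols, .fns = NULL, ...) {
  c(
    .cols   = deparse(substitute(.cols)),
    .fns    = deparse(substitute(.fns)),
    n_extra = as.character(...length())
  )
}

# Bare dots, as in the question: each column becomes its own argument
bare <- function(...) across_like(..., toupper)

# Dots wrapped in c(), as in the fix: one selection expression
wrapped <- function(...) across_like(c(...), toupper)

bare_result <- bare(x, y)
wrapped_result <- wrapped(x, y)
print(bare_result)
print(wrapped_result)
```

With the bare dots, x is matched to .cols, y is matched to .fns, and the function is pushed into the stand-in's own dots. With c(...), the whole selection stays in .cols and the function lands in .fns. That mismatch is most likely what produces the splice error in the question, although the exact message may differ between dplyr versions.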
library(dplyr, warn.conflicts = FALSE)
sessionInfo()$otherPkgs$dplyr$Version
#> [1] "1.0.7"
tib <- tibble(
  x = c("cats and dogs", "foxes and hounds"),
  y = c("whales and dolphins", "cats and foxes"),
  z = c("dogs and geese", "cats and mice")
)
filter_words <- function(.data, ...) {
  words_to_filter <- c("cat", "dog")
  .data %>% mutate(
    across(c(...), ~ gsub(
      paste0(words_to_filter, collapse = "|"),
      "#@!*", ., perl = TRUE
    ))
  )
}
tib %>%
  filter_words(x, y)
#> # A tibble: 2 × 3
#> x y z
#> <chr> <chr> <chr>
#> 1 #@!*s and #@!*s whales and dolphins dogs and geese
#> 2 foxes and hounds #@!*s and foxes cats and mice
Created on 2022-01-17 by the reprex package (v2.0.1)
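If you would rather not use dots at all, one alternative is to take a single tidyselect argument and forward it with the embrace operator. This is only a sketch: filter_words2, cols and words are names made up for illustration, and the caller then has to write c(x, y) explicitly.

```r
library(dplyr, warn.conflicts = FALSE)

tib <- tibble(
  x = c("cats and dogs", "foxes and hounds"),
  y = c("whales and dolphins", "cats and foxes"),
  z = c("dogs and geese", "cats and mice")
)

# Hypothetical variant: `cols` is one tidyselect expression, e.g. c(x, y),
# forwarded to across() with {{ }}. `words` is an illustrative parameter.
filter_words2 <- function(.data, cols, words = c("cat", "dog")) {
  # Build the alternation pattern once, outside the lambda
  pattern <- paste0(words, collapse = "|")
  .data %>%
    mutate(across({{ cols }}, ~ gsub(pattern, "#@!*", .x, perl = TRUE)))
}

res <- tib %>% filter_words2(c(x, y))
print(res)
```

The upside is that helpers such as starts_with() or where(is.character) can be passed in the same way. The downside is the extra c() at the call site, so the c(...) version above is probably the closer fit for the interface in the question.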