This question is meant to build a deeper understanding of the R functions across() and where(). I ran the code below and got the following message. I want to understand:
a) what the difference is between the good and the bad practice
b) how exactly the where() function works, both in general and in this use case
library(tidyverse)
iris %>% mutate(across(is.character,as.factor)) %>% str()
Warning message:
Problem with `mutate()` input `..1`.
i Predicate functions must be wrapped in `where()`.
# Bad
data %>% select(is.character)
# Good
data %>% select(where(is.character))
i Please update your code.
In terms of the result, there is not much difference between using where() and not using it. The bare predicate still works, but that form is deprecated, so a warning suggests the preferred syntax. Basically, where() takes a predicate function and applies it to every variable (column) of your data set. It then returns every variable for which the function returns TRUE. The following examples are taken from the documentation of where():
iris %>% select(where(is.numeric))
# or an anonymous function
iris %>% select(where(function(x) is.numeric(x)))
# or a purrr style formula as a shortcut for creating a function on the spot
iris %>% select(where(~ is.numeric(.x)))
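To make the idea more concrete, here is a rough base R sketch of what where() does conceptually. It is only an illustration of the behaviour described above, not how tidyselect is actually implemented, and the names keep, base_result and tidy_result are made up for this example.

```r
library(dplyr)

# Apply the predicate to every column: one TRUE/FALSE per column.
keep <- vapply(iris, is.numeric, logical(1))
keep

# Keep only the columns flagged TRUE (base R).
base_result <- iris[, keep, drop = FALSE]

# The tidyselect version should pick the same columns.
tidy_result <- iris %>% select(where(is.numeric))
identical(names(base_result), names(tidy_result))
```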
You can also combine two conditions using the && operator:
# The following code selects all numeric variables whose means are greater than 3.5
iris %>% select(where(~ is.numeric(.x) && mean(.x) > 3.5))
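The order of the two conditions matters here. As a small sketch, the same predicate can be written as a named function (numeric_with_large_mean is a hypothetical name used only for this illustration) to see how it behaves on individual columns:

```r
library(dplyr)

# Hypothetical helper, equivalent to the formula ~ is.numeric(.x) && mean(.x) > 3.5
numeric_with_large_mean <- function(x) is.numeric(x) && mean(x) > 3.5

# && short-circuits: for the factor column Species, is.numeric() is FALSE,
# so mean() is never evaluated on it.
numeric_with_large_mean(iris$Species)
numeric_with_large_mean(iris$Sepal.Length)

# where() accepts a named function as well as a formula.
selected <- iris %>% select(where(numeric_with_large_mean))
names(selected)
```

With the built-in iris data, this should keep only Sepal.Length and Petal.Length, since the means of the other two numeric columns are below 3.5.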
You can use where(is.character) for the .cols argument of the across() function and then apply the function given in the .fns argument to the selected columns.
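Applied to the code from the question, a minimal sketch could look like the following. Note that iris has no character columns (Species is already a factor), so the original call would not convert anything. The small data frame df below is made up purely for illustration.

```r
library(dplyr)

# Illustrative data with one character column (not from the original question).
df <- data.frame(
  id = 1:3,
  grp = c("a", "b", "a"),
  stringsAsFactors = FALSE
)

# where() wraps the predicate, so no warning is raised;
# as.factor is applied only to the character columns.
out <- df %>% mutate(across(where(is.character), as.factor))
str(out)
```

Here only grp is converted to a factor, while id is left untouched.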
For more information, you can always refer to the documentation, which is the best source for learning more about these functions.