r dplyr tidyverse across

Understand the warning message in across in R


This question is meant to build a deeper understanding of the R functions across and where. I ran the code below and got the warning message that follows it. I want to understand:

a) what the difference is between the "good" and "bad" practice shown in the message

b) how the where function works, both in general and in this use case

library(tidyverse)
iris %>% mutate(across(is.character,as.factor)) %>% str()


Warning message:
Problem with `mutate()` input `..1`.
i Predicate functions must be wrapped in `where()`.

  # Bad
  data %>% select(is.character)

  # Good
  data %>% select(where(is.character))

i Please update your code.

Solution

  • In terms of the result, there is not much difference between using where and not using it: a bare predicate still works, but it triggers a warning asking you to switch to the recommended syntax. where takes a predicate function and applies it to every variable (column) of your data set. It then selects every variable for which the function returns TRUE. The following examples are taken from the documentation of where:

    iris %>% select(where(is.numeric))
    # or an anonymous function
    iris %>% select(where(function(x) is.numeric(x)))
    # or a purrr style formula as a shortcut for creating a function on the spot
    iris %>% select(where(~ is.numeric(.x)))
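
    To make the mechanism concrete, here is a rough base R sketch of what where(is.numeric) does conceptually. It is only an illustration of the idea (apply the predicate to each column, keep the columns that return TRUE), not how tidyselect is actually implemented; the variable name keep is made up for this example.

```r
# Apply the predicate to every column; the result is one TRUE/FALSE per column
keep <- vapply(iris, is.numeric, logical(1))
print(keep)

# Keep only the columns for which the predicate returned TRUE
numeric_cols <- iris[, keep]
print(names(numeric_cols))
```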
    

    You can also combine two conditions with &&. Because && short-circuits, mean() is only evaluated for columns that are numeric:

    # The following code selects all numeric variables whose means are greater than 3.5
    iris %>% select(where(~ is.numeric(.x) && mean(.x) > 3.5))
    

    In across, pass where(is.character) (without select) as the .cols argument, then supply the function to apply to the selected columns in the .fns argument. Note that iris has no character columns (Species is already a factor), so in your example across has nothing to convert. For more details, see the documentation of where and across.
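
As a minimal sketch of the warning-free version of the code in the question: since iris contains no character columns, the example first builds a copy in which Species is a character column. The names iris_chr and result are made up for illustration.

```r
library(dplyr)

# iris has no character columns, so create a copy that has one
# (an assumption made purely for this illustration)
iris_chr <- iris %>% mutate(Species = as.character(Species))

# Wrap the predicate in where(): every character column is converted
# to a factor, and the "must be wrapped in where()" warning is not raised
result <- iris_chr %>% mutate(across(where(is.character), as.factor))

str(result)
```

Here where(is.character) fills the .cols argument and as.factor fills .fns, which matches the "Good" pattern in the warning message.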