I have a dataset with features of type character (not all are binary and one of them represents a region).
To avoid calling the encoding function once per column, I tried to use a pipe and across() to select all of the character columns and encode them with the function below:
encode_ordinal <- function(x, order = unique(x)) {
  x <- as.numeric(factor(x, levels = order, exclude = NULL))
  x
}
dataset <- dataset %>%
  encode_ordinal(across(where(is.character)))
However, it seems that I am not using across() correctly as I get the error:
Error: across() must only be used inside dplyr verbs.
I wonder if I am overcomplicating this and there is an easier way of achieving it, i.e., identifying all of the character columns and encoding them.
As the error says, across() only works inside a dplyr verb. Call across() inside mutate() and pass encode_ordinal to it as the function to apply, as illustrated in the following example:
library(dplyr)

dataset <- tibble(x = 1:3, y = c('a', 'b', 'b'), z = c('A', 'A', 'B'))
# # A tibble: 3 x 3
#       x y     z
#   <int> <chr> <chr>
# 1     1 a     A
# 2     2 b     A
# 3     3 b     B
dataset %>%
  mutate(across(where(is.character), encode_ordinal))
# # A tibble: 3 x 3
#       x     y     z
#   <int> <dbl> <dbl>
# 1     1     1     1
# 2     2     2     1
# 3     3     2     2
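If one of the columns has a natural order, the order argument of encode_ordinal can be supplied for that column first, and the remaining character columns can then be encoded by order of first appearance. The following is only a minimal sketch: the columns region and size and their values are made up for illustration and are not taken from the original dataset.

```r
library(dplyr)

# Same helper as in the question: codes follow the position of each value in `order`.
encode_ordinal <- function(x, order = unique(x)) {
  as.numeric(factor(x, levels = order, exclude = NULL))
}

# Hypothetical data; `region` and `size` are illustrative column names.
dataset <- tibble(
  id     = 1:4,
  region = c("north", "south", "south", "east"),
  size   = c("small", "large", "medium", "small")
)

encoded <- dataset %>%
  # Step 1: encode `size` with an explicit, meaningful order.
  mutate(size = encode_ordinal(size, order = c("small", "medium", "large"))) %>%
  # Step 2: `size` is now numeric, so only `region` is still character here.
  mutate(across(where(is.character), encode_ordinal))

encoded$region  # 1 2 2 3
encoded$size    # 1 3 2 1
```

Splitting the work into two mutate() calls keeps the explicitly ordered column out of the where(is.character) selection, so it is not encoded a second time.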