I am trying to pass two functions to mutate(across(where(is.factor), ...)) to order the factor levels and drop unused levels. The code does not work as expected. What might have gone wrong?
#---- Libraries ----
library(tidyverse)
#---- Data ----
set.seed(2021)
df <- tibble(
  a1 = factor(ifelse(sign(rnorm(30)) == -1, 0, 1), labels = c("No", "Yes")),
  a2 = factor(ifelse(sign(rnorm(30)) == -1, 0, 1), labels = c("No", "Yes")),
  gender = gl(2, 15, labels = c("Males", "Females")),
  b2 = gl(3, 10, labels = c("Primary", "Secondary", "Tertiary", "Unknown")),
  c1 = gl(3, 10, labels = c("15-19", "20-24", "25-30", "30-35")),
  outcome = factor(ifelse(sign(rnorm(30)) == -1, 0, 1), labels = c("No", "Yes")),
  weight = runif(30, 1, 12)
)
#---- Problem ----
df <- df %>%
  mutate(across(where(is.factor), list(fct_infreq, fct_drop)))
levels(df$b2)
# The unused levels are not dropped
The issue is that passing a list of functions to across() does not chain them: it creates one new column per function. Your resulting data frame therefore contains two extra columns, b2_1 and b2_2, holding the results of fct_infreq and fct_drop respectively, while b2 itself is left unchanged.
If you run levels(df$b2_2) you'll see that the unused level has been dropped there.
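To make the column naming concrete, here is a minimal sketch on a toy tibble. The names toy, res and f are made up for illustration and are not part of your data; the expected values in the comments assume a recent dplyr (1.0 or later) and forcats.

```r
library(tidyverse)

# Hypothetical toy data: level "c" is declared but never used
toy <- tibble(f = factor(c("a", "b", "b"), levels = c("a", "b", "c")))

# Same pattern as in the question: an unnamed list of two functions
res <- toy %>%
  mutate(across(where(is.factor), list(fct_infreq, fct_drop)))

names(res)       # "f" "f_1" "f_2": one new column per function
levels(res$f)    # "a" "b" "c": the original column is untouched
levels(res$f_1)  # "b" "a" "c": fct_infreq only, unused level kept
levels(res$f_2)  # "a" "b":     fct_drop only, original order kept
```

Neither new column has both operations applied, which is why checking levels(df$b2) shows no change.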
If your aim is to first drop the unused levels and then reorder, you need to run consecutive mutates:
df <- df %>%
  mutate(across(where(is.factor), fct_drop)) %>%
  mutate(across(where(is.factor), fct_infreq))
Or nest the two functions in a single mutate:
df <- df %>%
  mutate(across(where(is.factor), ~ fct_infreq(fct_drop(.x))))
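
As a quick check, the following sketch applies both variants to a small made-up tibble (toy, two_step, nested and f are illustrative names only) and confirms that they overwrite the column in place and give the same levels.

```r
library(tidyverse)

# Hypothetical toy data: level "c" is declared but never used
toy <- tibble(f = factor(c("a", "b", "b"), levels = c("a", "b", "c")))

# Variant 1: two consecutive mutates
two_step <- toy %>%
  mutate(across(where(is.factor), fct_drop)) %>%
  mutate(across(where(is.factor), fct_infreq))

# Variant 2: nested calls inside one lambda
nested <- toy %>%
  mutate(across(where(is.factor), ~ fct_infreq(fct_drop(.x))))

names(nested)     # "f": no extra columns are created
levels(nested$f)  # "b" "a": unused level dropped, most frequent first

# Both variants should agree on the resulting levels
same_levels <- identical(levels(two_step$f), levels(nested$f))
same_levels
```

Because a single function (or lambda) is supplied rather than a list, across() replaces each factor column instead of adding suffixed copies.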