I know this should be straightforward, but it always bites me.
Suppose I have a factor:
library(dplyr)
library(forcats)
fruits <- as.factor(c("apples", "oranges", "oranges", "pears", "pears", "pears"))
df <- as.data.frame(fruits)
I want to reorder the factor levels by their frequency (or some other statistic) so that pears > oranges > apples. How do I do that without explicitly calling df %>% group_by(fruits) %>% summarise(freq = n()) %>% fct_reorder(fruits, freq, .desc = TRUE)?
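fct_reorder expects a factor as its first argument, so piping the summarised data frame straight into it fails. Call it inside mutate instead: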
library(dplyr)
library(forcats)
out <- df %>%
  group_by(fruits) %>%
  summarise(freq = n(), .groups = 'drop') %>%
  mutate(fruits = fct_reorder(fruits, freq, .desc = TRUE))
Checking the order of the levels:
levels(out$fruits)
[1] "pears" "oranges" "apples"
levels(df$fruits)
[1] "apples" "oranges" "pears"
If we want to do this on the original dataset, use add_count instead of summarise to create a frequency column, then apply fct_reorder:
df <- df %>%
  add_count(fruits) %>%
  mutate(fruits = fct_reorder(fruits, n, .desc = TRUE)) %>%
  select(-n)
NOTE: group_by in dplyr 1.0.6 doesn't have a .desc argument. The .desc argument belongs to fct_reorder.
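If frequency is the only statistic needed, forcats also provides fct_infreq, which orders levels by how often they occur and so avoids the helper count column. A minimal sketch, assuming the same fruits data as in the question (df2 is just an illustrative name):

```r
library(dplyr)
library(forcats)

df <- data.frame(
  fruits = factor(c("apples", "oranges", "oranges", "pears", "pears", "pears"))
)

# fct_infreq() reorders the levels by frequency, most frequent first
df2 <- df %>%
  mutate(fruits = fct_infreq(fruits))

levels(df2$fruits)
# [1] "pears"   "oranges" "apples"
```

This only covers ordering by count; for any other statistic, fct_reorder is still the tool to reach for.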
In base R, we can do this with table:
out1 <- table(fruits)
factor(fruits, levels = names(out1[order(-out1)]))
[1] apples oranges oranges pears pears pears
Levels: pears oranges apples
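For the "some other statistic" part of the question, fct_reorder can summarise a second column per level through its .fun argument. The sketch below is only an illustration: the weight column and its values are made up and are not part of the original data.

```r
library(dplyr)
library(forcats)

# Hypothetical data: `weight` is invented purely for this example
df_w <- data.frame(
  fruits = factor(c("apples", "oranges", "oranges", "pears", "pears", "pears")),
  weight = c(150, 130, 140, 180, 170, 190)
)

# Order the levels by the mean weight within each level, largest first
df_w <- df_w %>%
  mutate(fruits = fct_reorder(fruits, weight, .fun = mean, .desc = TRUE))

levels(df_w$fruits)
# [1] "pears"   "apples"  "oranges"
```

Here the per-level means are 180 (pears), 150 (apples) and 135 (oranges), which is why apples now sorts ahead of oranges.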