
How to reorder a factor based on a subset (facets) of another variable, using forcats?


The forcats vignette states:

"The goal of the forcats package is to provide a suite of useful tools that solve common problems with factors."

Indeed, one of those tools reorders a factor by another variable, which is a very common need when plotting data. I was trying to use forcats for this in a faceted plot. That is, I want to reorder a factor by another variable, but using only a subset of the data. Here's a reprex:

library(tidyverse)

ggplot2::diamonds %>% 
    group_by(cut, clarity) %>% 
    summarise(value = mean(table, na.rm = TRUE)) %>%
    ggplot(aes(x = clarity, y = value, color = clarity)) + 
    geom_segment(aes(xend = clarity, y = min(value), yend = value), 
                 size = 1.5, alpha = 0.5) + 
    geom_point(size = 3) + 
    facet_grid(rows = "cut", scales = "free") +
    coord_flip() +
    theme(legend.position = "none")

This code produces a plot close to what I want:

[Figure: mean table by clarity, one facet per cut, clarity in its default order]

However, I want the clarity axis sorted by value, so I can quickly spot which clarity has the highest value. Sorting within each facet would imply a different order per facet, so I'd like to order the whole plot by the values within one specific facet.

The straightforward use of forcats does not work here, because fct_reorder() orders the factor using the values from all facets (by default, the median of value per level), not only those of a specific facet. Let's do it:

# Inserting this line right before the ggplot call
mutate(clarity = forcats::fct_reorder(clarity, value)) %>%

It then produces this plot:

[Figure: the same faceted plot, with clarity ordered by value across all cuts]

As expected, it reordered the factor based on the whole data. But what if I want the plot ordered by the values of the "Ideal" cut? How can I do this with forcats?
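To make the difference concrete, here is a minimal sketch on made-up data (the three clarity levels and the numbers are purely illustrative, not taken from diamonds). It compares the level order obtained from all rows with the order obtained from the "Ideal" rows only:

```r
library(forcats)

# Toy data: two cuts, three clarity levels, made-up values.
cut     <- c("Fair", "Fair", "Fair", "Ideal", "Ideal", "Ideal")
clarity <- factor(c("I1", "SI2", "VS1", "I1", "SI2", "VS1"))
value   <- c(10, 1, 2, 2, 3, 1)

# Default behaviour: each level is summarised (median) over ALL rows.
by_all <- fct_reorder(clarity, value)
levels(by_all)    # "VS1" "SI2" "I1"

# Restricting both vectors to the "Ideal" rows gives a different order.
is_ideal <- cut == "Ideal"
by_ideal <- fct_reorder(clarity[is_ideal], value[is_ideal])
levels(by_ideal)  # "VS1" "I1" "SI2"
```

The second order is the one I am after, but it lives on a factor that only covers the "Ideal" rows, so it still has to be transferred to the full column.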

My current solution would be as follows:

ggdf <- ggplot2::diamonds %>% 
    group_by(cut, clarity) %>% 
    summarise(value = mean(table, na.rm = TRUE))

# The trick would be to create an auxiliary factor using only
# the subset of the data I want, and then use the levels
# to reorder the factor in the entire dataset.
#
# Note that I use good old stats::reorder() rather than the forcats
# version. Either would work here; I use base R to emphasize that,
# so far, I haven't found an advantage in using forcats.
reordered_factor <- reorder(ggdf$clarity[ggdf$cut == "Ideal"], 
                            ggdf$value[ggdf$cut == "Ideal"])

ggdf$clarity <- factor(ggdf$clarity, levels = levels(reordered_factor))

ggdf %>%
    ggplot(aes(x = clarity, y = value, color = clarity)) + 
    geom_segment(aes(xend = clarity, y = min(value), yend = value), 
                 size = 1.5, alpha = 0.5) + 
    geom_point(size = 3) + 
    facet_grid(rows = "cut", scales = "free") +
    coord_flip() +
    theme(legend.position = "none")

This produces what I want:

[Figure: the same faceted plot, with clarity ordered by its values in the "Ideal" facet]

But I wonder if there is a more elegant/clever way to do it using forcats.


Solution

  • If you want to reorder clarity by the values of a particular facet you have to tell forcats::fct_reorder() to do so, e.g.,

    mutate(clarity = forcats::fct_reorder(
        clarity, filter(., cut == "Ideal") %>% pull(value)))
    

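    which uses only the values for the "Ideal" facet for reordering. Note that this appears to rely on the data still being grouped by cut after summarise(): mutate() then runs once per cut, and the clarity values of each group line up one-to-one with the "Ideal" values.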

    Thus,

    ggplot2::diamonds %>% 
      group_by(cut, clarity) %>% 
      summarise(value = mean(table, na.rm = TRUE)) %>%
      mutate(clarity = forcats::fct_reorder(
        clarity, filter(., cut == "Ideal") %>% pull(value))) %>%
      ggplot(aes(x = clarity, y = value, color = clarity)) + 
      geom_segment(aes(xend = clarity, y = min(value), yend = value), 
                   size = 1.5, alpha = 0.5) + 
      geom_point(size = 3) + 
      facet_grid(rows = "cut", scales = "free") +
      coord_flip() +
      theme(legend.position = "none")
    

    produces the plot ordered by the values of the "Ideal" facet, as requested.
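If you would rather not depend on the grouping or on the row order of the summarised data, one possible variant is to compute the level order from the "Ideal" rows first and then apply it with forcats::fct_relevel(). This is only a sketch on made-up data: the toy data frame and the names ideal_order and reordered are illustrative, and the values are not the real diamonds means.

```r
library(dplyr)
library(forcats)

# Toy stand-in for the summarised data (values are made up).
ggdf <- data.frame(
  cut     = c("Fair", "Fair", "Fair", "Ideal", "Ideal", "Ideal"),
  clarity = factor(c("I1", "SI2", "VS1", "I1", "SI2", "VS1")),
  value   = c(3, 1, 2, 2, 3, 1)
)

# Level order derived from the reference facet only.
ideal_order <- ggdf %>%
  filter(cut == "Ideal") %>%
  arrange(value) %>%
  pull(clarity) %>%
  as.character()

# Apply that order to the factor in the whole data set.
reordered <- ggdf %>%
  mutate(clarity = fct_relevel(clarity, ideal_order))

levels(reordered$clarity)  # "VS1" "I1" "SI2"
```

The result can be piped into the same ggplot() call as before. Because the order is computed explicitly, it does not matter whether the data frame is grouped or how its rows are sorted.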