
Reformat label / preserve order of Multi-factor facets in ggplot2::facet_wrap() based on factor level


I wish to facet a graph based on two factors, rename the facets using a combination of the two facet factor values, but preserve the order of the facets based on the levels in the original factors.

The data looks something like this:

library(tidyverse)

set.seed(100)

tmp.d <- data.frame(
  sector = factor(rep(c("B","A"),c(6,3)), levels = c("B","A")),
  subsector = factor(rep(c("a","b","c"), each = 3), levels = c("c","b","a")),
  year = factor(rep(2020:2022,3)),
  value = sample(8:15,9, replace = TRUE)
)

#> tmp.d
#  sector subsector year value
#1      B         a 2020     9
#2      B         a 2021    14
#3      B         a 2022    13
#4      B         b 2020    15
#5      B         b 2021    10
#6      B         b 2022     8
#7      A         c 2020     9
#8      A         c 2021    13
#9      A         c 2022    11

Which is plotted and faceted by sector and subsector...

ggplot(tmp.d, aes(x = year, y = value, group = 1)) +
  geom_path()+
  facet_wrap(facets = list("sector","subsector"))

...and looks like this:

[Image: the faceted plot, with sector and subsector on separate lines of each facet label]

Notice that the facets keep the order set by the factor levels of "sector" and "subsector." This is desirable.

However, instead of listing the sector and subsector on separate lines, I want the facet labels to read "[sector]: [subsector]" as in "B: b".

Attempt 1:

Adding a helper column to tmp.d, containing the facet labels.

tmp.d <- tmp.d %>% mutate(label = factor(paste0(sector, ": ", subsector)))

ggplot(tmp.d, aes(x = year, y = value, group = 1)) +
  geom_path()+
  facet_wrap(facets = list("label"))

Which yields:

[Image: the plot faceted by the combined label, with the facets no longer in factor-level order]

Here, the facet labels are correct, but I've lost the order from the sector/subsector factor levels.
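
The lost order is most likely explained by how factor() treats a character vector: when no levels are supplied, it sorts the unique values to build the levels. A minimal base-R sketch, rebuilding only the two facet factors from tmp.d (the standalone object names are just for illustration):

```r
# Rebuild the two facet factors with the same level order as in tmp.d
sector    <- factor(rep(c("B", "A"), c(6, 3)), levels = c("B", "A"))
subsector <- factor(rep(c("a", "b", "c"), each = 3), levels = c("c", "b", "a"))

# paste0() returns plain character strings, and factor() sorts the unique
# strings to build its levels, so the sector/subsector level order is dropped
label <- factor(paste0(sector, ": ", subsector))
levels(label)
```

facet_wrap() lays the panels out in level order, so the panels follow these sorted levels instead of putting sector B first.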

Attempt 2:

I think the answer may lie in a custom as_labeller function, or perhaps in changing the settings of an existing labeller like label_value, which has a multi_line = [bool] argument that controls whether the facet values appear on a single line or multiple lines. Other members of the label_ family have another argument, sep =, which I believe controls how the values are separated on the same line. Presumably, the combination of ...multi_line = FALSE, sep = ": "... might format the label and preserve the desired order.

The labeller is applied in the call to facet_wrap().

ggplot(tmp.d, aes(x = year, y = value, group = 1)) +
  geom_path()+
  facet_wrap(facets = list("sector","subsector"), labeller = [the labeller function])

Setting the labeller to an existing labeller function without changing default settings (see below) yields the same output as my original attempt above.

...
facet_wrap(facets = list("sector","subsector"), labeller = label_value)
...

Attempting to change the attribute values for label_value like so...

...
facet_wrap(facets = list("sector","subsector"), labeller = label_value(multi_line = FALSE))
...

... does not work, because label_value() requires a labels argument that I do not know how to provide. Passing the facet factors as names or as character strings (either as a list or as a vector) does not appear to work. The examples I found in the documentation and elsewhere use facet_grid instead of facet_wrap, and the facets are given as a formula like ~sector+subsector, which I assume is treated like a grid/matrix where sectors are columns and subsectors are rows. In my case, most (but not necessarily all) combinations of sector/subsector will be unique (i.e., Sectors A and B do not share subsectors).

Question Summary

Is there a simple way to achieve my objectives (restated for convenience):

  • facet on two factor variables (facet_wrap, not facet_grid)
  • preserve facet order based on factor levels
  • reformat the facet label to a single line with sector and subsector separated by a colon

Thanks,


Solution

  • Wow, that was a lot trickier than I expected... One solution would be to combine them into a different field:

    tmp.d |> 
      arrange(sector, subsector) |>          # arrange by factor levels
      mutate(
        facet = 
          paste0(sector, ": ", subsector) |>    
          fct_inorder(ordered = TRUE)        # use that order for the new field
      ) |> 
      ggplot(aes(x = year, y = value, group = 1)) +
      geom_path()+
      facet_wrap(facets = ~facet)            # here
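
The same combined field can probably be built without arranging first, using base R's interaction(). This is a sketch rather than a tested drop-in: the value column is copied from the table printed in the question, and the plotting step is guarded so the snippet still runs where ggplot2 is not installed.

```r
# Data as printed in the question (values copied from the table shown there)
tmp.d <- data.frame(
  sector    = factor(rep(c("B", "A"), c(6, 3)), levels = c("B", "A")),
  subsector = factor(rep(c("a", "b", "c"), each = 3), levels = c("c", "b", "a")),
  year      = factor(rep(2020:2022, 3)),
  value     = c(9, 14, 13, 15, 10, 8, 9, 13, 11)
)

# lex.order = TRUE makes sector vary slowest, so the combined levels follow
# the sector levels first and the subsector levels second;
# drop = TRUE removes combinations that never occur in the data
tmp.d$facet <- interaction(
  tmp.d$sector, tmp.d$subsector,
  sep = ": ", lex.order = TRUE, drop = TRUE
)
levels(tmp.d$facet)

# Plotting is optional here and only runs if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(tmp.d, aes(x = year, y = value, group = 1)) +
    geom_path() +
    facet_wrap(~facet)
  # print(p) draws the plot
}
```

Because the order comes from the factor levels themselves, this does not depend on the row order of the data.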
    

    This also works if a ", " is acceptable:

    ggplot(tmp.d, aes(x = year, y = value, group = 1)) +
      geom_path()+
      facet_wrap(
        facets = sector~subsector, 
        labeller = 
          labeller(                  # here
            sector = label_value,    #
            subsector = label_value, #
            .multi_line = FALSE      #
          )
      )
    

    A similar thing can be done with purrr::partial(), which overrides the defaults, but again you get a comma. I think it would be worth opening an issue on the ggplot2 GitHub page to add a sep argument to the label_*() functions.

    ... +
      facet_wrap(
        facets = sector~subsector, 
        labeller = purrr::partial(label_value, multi_line = FALSE)
      )
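
If the colon matters and a helper column is unwanted, a small custom labeller function may also do the job. The sketch below rests on my understanding of the labeller interface: the function receives a data frame with one column per facet variable and returns a list of character vectors, one per strip line. label_colon is a hypothetical helper name, not part of ggplot2, and the value column is copied from the table printed in the question.

```r
# Hypothetical helper (not part of ggplot2): paste the facet values of each
# panel into a single strip line, separated by ": "
label_colon <- function(labels) {
  cols <- unname(lapply(labels, as.character))
  list(do.call(paste, c(cols, sep = ": ")))
}

# Quick check without plotting: one row per panel, one column per facet variable
strips <- data.frame(
  sector    = factor(c("B", "B", "A")),
  subsector = factor(c("b", "a", "c"))
)
label_colon(strips)

# Data as printed in the question (values copied from the table shown there)
tmp.d <- data.frame(
  sector    = factor(rep(c("B", "A"), c(6, 3)), levels = c("B", "A")),
  subsector = factor(rep(c("a", "b", "c"), each = 3), levels = c("c", "b", "a")),
  year      = factor(rep(2020:2022, 3)),
  value     = c(9, 14, 13, 15, 10, 8, 9, 13, 11)
)

# Plotting is optional here and only runs if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(tmp.d, aes(x = year, y = value, group = 1)) +
    geom_path() +
    facet_wrap(~ sector + subsector, labeller = label_colon)
  # print(p) draws the plot
}
```

Faceting still happens on the two original factors, so the panel order should follow their levels, and only the strip text changes.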