Tags: r, ggplot2, facet, geom-hline

Facetting with factorised variables and geom_hline / geom_vline


Consider this code:

library(ggplot2)

ggplot(data = mtcars) +
  geom_point(aes(x = drat, y = wt)) +
  geom_hline(yintercept = 3) +
  facet_grid(~ cyl)                       ## works

ggplot(data = mtcars) +
  geom_point(aes(x = drat, y = wt)) +
  geom_hline(yintercept = 3) +
  facet_grid(~ factor(cyl))              ## does not work

# Error in factor(cyl) : object 'cyl' not found

# removing geom_hline: works again. 

A search turned up a workaround, namely wrapping the intercept in aes():

ggplot(data = mtcars) +
  geom_point(aes(x = drat, y = wt)) +
  geom_hline(aes(yintercept = 3)) +
  facet_grid(~ factor(cyl))                  # works

# R version 3.4.3 (2017-11-30)  
# ggplot2_2.2.1

Hadley writes here that, when facetting with a function, the variables it uses need to be present in every layer (which sounds mysterious to me).

Why does this happen when factorising the facet variable?


Solution

  • So here's my best guess and explanation.

    When Hadley says:

    This is a known limitation of facetting with a function - the variables you use have to be present on every layer.

    He means that in ggplot2, when you use a function inside the facetting specification (here, factor(cyl)), the variable it refers to has to be available in every geom. The error occurs because the cyl variable is not present in the geom_hline layer.

    It's important to remember that this is a limitation, not ideal behaviour. It is a consequence of how the ggplot2 code works: when a function is used to facet, the variables must be present in every geom.

    Without looking into the specifics of the ggplot2 functions, my guess is that wrapping yintercept in aes() gives geom_hline an aesthetic mapping. The aes function maps variables to components of the plot, rather than setting static values, and that is an important distinction. Even though we still write yintercept = 3, placing it inside the aesthetic mapping must somehow let the layer see that cyl exists in the plot's data. That is, it connects geom_hline indirectly with cyl, so the variable is now available in that layer and the limitation no longer bites.

    This may not be an entirely satisfying answer, but without reading through the ggplot2 source to work out exactly why this limitation occurs, it might be as good as you'll get for now. Hopefully the aes() workaround is sufficient for you :)
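
Another way around the limitation, offered as a sketch rather than an explanation of the internals: compute the factor as a real column before plotting, so the facet formula contains a plain column name instead of a function call. That is the same situation as the first example in the question (`facet_grid(~ cyl)`), which works with a constant `geom_hline`. The names `mtcars2` and `cyl_f` are made up for this illustration.

```r
library(ggplot2)

# Assumption: we are free to add a helper column to a copy of the data.
# 'mtcars2' and 'cyl_f' are illustrative names, not part of the question.
mtcars2 <- transform(mtcars, cyl_f = factor(cyl))

p <- ggplot(data = mtcars2) +
  geom_point(aes(x = drat, y = wt)) +
  geom_hline(yintercept = 3) +   # constant intercept, no aes() needed
  facet_grid(~ cyl_f)            # plain column name, no function call

print(p)
```

Because the facet specification no longer calls a function, the horizontal line does not need to know about `cyl` at all, and it is drawn in every panel.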