I am often unsure of exactly which elements of data, attributes, and other components of a graphical object in ggplot2 are inherited by which other elements, and where the defaults that flow down to, e.g., geoms, originate. In particular cases these questions can generally be answered by close reading of Hadley's ggplot2 book. But I would find it useful to have some sort of visualization of the overall flow of inheritance in ggplot2, and I wonder if anyone has seen, or created, or knows how to create, such a thing. In the same vein, a compact list of default values which arise in one level of specification (like the aes or a theme) and are inherited by another level (like a geom or scale) would be very useful to me, and I suspect to many people learning how to use ggplot2.
I would accept any of the following as an answer:
This question seems to be about multiple levels of the ggplot2 package at once, but I'll try my best to give some information. It's almost impossible to describe the entire inheritance system of ggplot2 in one Stack Overflow answer, but pointing at the right functions might help get you started.
At the top level, both data and aesthetic mappings are inherited from the main ggplot() call. In the code below, geom_point() inherits both the mapping and the data:
library(ggplot2)
ggplot(iris, aes(Sepal.Width, Sepal.Length)) +
  geom_point()
This holds unless you explicitly give the layer its own mapping and switch off inheritance with inherit.aes = FALSE:
ggplot(iris, aes(Sepal.Width, Sepal.Length)) +
  geom_point(aes(Petal.Width, Petal.Length), inherit.aes = FALSE)
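One way to check what a layer actually inherited is to look at the data ggplot2 builds for it. The sketch below assumes a plot-level colour mapping (not part of the example above) so that the difference is visible in the built data; layer_data() is the exported helper used to pull out the first layer's data.

```r
library(ggplot2)

# Plot-level data and mapping; the colour mapping is added here purely
# for illustration, so inheritance leaves a visible trace.
p <- ggplot(iris, aes(Sepal.Width, Sepal.Length, colour = Species))

# Layer that inherits everything from the ggplot() call
inherited <- layer_data(p + geom_point())

# Layer with its own mapping and inheritance switched off
standalone <- layer_data(
  p + geom_point(aes(Petal.Width, Petal.Length), inherit.aes = FALSE)
)

length(unique(inherited$colour))   # 3: one colour per species
length(unique(standalone$colour))  # 1: only the geom's default colour
```

The standalone layer still inherits the data (iris), because inherit.aes only concerns the aesthetic mapping.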
Next, at the level of individual layers, certain defaults are inherited from either the stats, geoms or positions. Consider the following plot:
df <- reshape2::melt(volcano)
ggplot(df, aes(Var1, Var2)) +
  geom_raster()
The raster will be a dark grey colour, because we haven't specified a fill mapping. You can get a sense of what defaults a geom or stat has by looking at its ggproto object:
> GeomRaster$default_aes
Aesthetic mapping:
* `fill` -> "grey20"
* `alpha` -> NA
> StatDensity$default_aes
Aesthetic mapping:
* `y` -> `stat(density)`
* `fill` -> NA
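As a rough check that the geom default is only a fallback, you can compare the built layer data with and without a fill mapping. This is a sketch: the data frame is built with base R rather than reshape2 to keep it self-contained, and the exact default colour may differ between ggplot2 versions, so no specific value is assumed.

```r
library(ggplot2)

# Long-format version of the volcano matrix, built with base R
df <- data.frame(
  Var1  = c(row(volcano)),
  Var2  = c(col(volcano)),
  value = c(volcano)
)

p <- ggplot(df, aes(Var1, Var2)) + geom_raster()

# Aesthetics for which the geom supplies a default
names(GeomRaster$default_aes)

# Without a fill mapping, every tile gets the single default fill
default_fill <- unique(layer_data(p)$fill)

# With a fill mapping, the default is overridden by the scale's output
mapped_fill <- unique(layer_data(p + aes(fill = value))$fill)

length(default_fill)  # 1
length(mapped_fill)   # many distinct colours
```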
Another key ingredient for understanding how layers are given their parameters is the code of layer(). Specifically, this bit here (abbreviated for clarity):
function (geom = NULL, stat = NULL, data = NULL, mapping = NULL,
    position = NULL, params = list(), inherit.aes = TRUE, check.aes = TRUE,
    check.param = TRUE, show.legend = NA, key_glyph = NULL,
    layer_class = Layer)
{
    ...
    aes_params <- params[intersect(names(params), geom$aesthetics())]
    geom_params <- params[intersect(names(params), geom$parameters(TRUE))]
    stat_params <- params[intersect(names(params), stat$parameters(TRUE))]
    ...
    ggproto("LayerInstance", layer_class, geom = geom, geom_params = geom_params,
        stat = stat, stat_params = stat_params, data = data,
        mapping = mapping, aes_params = aes_params, position = position,
        inherit.aes = inherit.aes, show.legend = show.legend)
}
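The effect of this splitting can be inspected on a finished layer object. The following is a minimal sketch using geom_histogram() as an arbitrary example; the fields aes_params, geom_params and stat_params are the ones set in the ggproto() call above, and the exact contents of each list may vary with the ggplot2 version.

```r
library(ggplot2)

# A layer with one fixed aesthetic (colour), one stat parameter (bins)
# and one parameter shared by geom and stat (na.rm)
l <- geom_histogram(colour = "red", bins = 10, na.rm = TRUE)

names(l$aes_params)   # fixed aesthetics, should include "colour"
names(l$stat_params)  # parameters routed to the stat, should include "bins"
names(l$geom_params)  # parameters routed to the geom, should include "na.rm"
```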
In the layer() code you can see that whatever parameters you give are checked against the valid aesthetics and parameters of the geom and stat, and distributed to the appropriate part of the layer. As the last call shows, a LayerInstance ggproto object is created, whose parent is the Layer class. That parent class is not exported, but you can still inspect the functions inside it. For example, if you are curious about how aesthetics are evaluated, you can type:
ggplot2:::Layer$compute_aesthetics
There you can see that some defaults of the scales are incorporated as well. Of course, what these layer functions do makes little sense unless you understand the order in which they are called. For that, we can have a look at the plot builder (also abbreviated for clarity):
> ggplot2:::ggplot_build.ggplot
function (plot)
{
    ...
    data <- by_layer(function(l, d) l$setup_layer(d, plot))
    ...
    data <- by_layer(function(l, d) l$compute_aesthetics(d, plot))
    data <- lapply(data, scales_transform_df, scales = scales)
    ...
    data <- layout$map_position(data)
    data <- by_layer(function(l, d) l$compute_statistic(d, layout))
    data <- by_layer(function(l, d) l$map_statistic(d, plot))
    scales_add_missing(plot, c("x", "y"), plot$plot_env)
    data <- by_layer(function(l, d) l$compute_geom_1(d))
    data <- by_layer(function(l, d) l$compute_position(d, layout))
    ...
    data <- by_layer(function(l, d) l$compute_geom_2(d))
    data <- by_layer(function(l, d) l$finish_statistics(d))
    ...
    structure(list(data = data, layout = layout, plot = plot),
        class = "ggplot_built")
}
From this, you can see that layers are set up first, then aesthetics are computed, then scale transformations are applied, then statistics are computed, then a part of the geom is computed, then the position, and finally the rest of the geom.
What this means is that the statistical transformations you put in are affected by scale transformations, but not by coord transformations (which are applied later, elsewhere).
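A small experiment can make this ordering concrete. The sketch below uses made-up data and stat_summary() with a mean, once with a log scale and once with a log coord. Note that coord_trans() is the function name I'm assuming here; recent ggplot2 releases may steer you towards coord_transform() instead.

```r
library(ggplot2)

# Toy data, chosen so the two means are easy to tell apart
df <- data.frame(g = "a", y = c(1, 10, 100))

base <- ggplot(df, aes(g, y)) +
  stat_summary(fun = mean, geom = "point")

# Scale transformation: applied BEFORE the stat, so the mean is taken
# over log10(y) = 0, 1, 2
scaled <- layer_data(base + scale_y_log10())

# Coord transformation: applied AFTER the stat (at draw time), so the
# mean is taken over the raw values 1, 10, 100
coorded <- layer_data(base + coord_trans(y = "log10"))

scaled$y   # 1, i.e. the mean of the log10 values
coorded$y  # 37, i.e. the mean of the raw values
```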
If you go through the code, you'll find that almost nothing theme-related is evaluated up until this point (except for some theme evaluation with the facets). As you can see, the building function returns an object of the class ggplot_built, which is still not graphical output. The interpretation of theme elements and the actual translation of the geoms into grid graphics happens in the following function:
ggplot2:::ggplot_gtable.ggplot_built
After this function, you'll have a gtable object that can be interpreted by grid::grid.draw(), which will output to your graphics device.
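The two stages can also be run by hand through the exported generics ggplot_build() and ggplot_gtable(), which is roughly what printing a plot does for you. A minimal sketch:

```r
library(ggplot2)

p <- ggplot(iris, aes(Sepal.Width, Sepal.Length)) +
  geom_point()

# Stage 1: compute layer data, scales and layout (no graphics yet)
built <- ggplot_build(p)
head(built$data[[1]])  # data of the first layer after stats and positions

# Stage 2: apply the theme and turn geoms into grid grobs
gt <- ggplot_gtable(built)
inherits(gt, "gtable")  # TRUE

# Stage 3: draw on the current graphics device
grid::grid.newpage()
grid::grid.draw(gt)
```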
Unfortunately, I'm not very well-versed in the inheritance of theme elements, but as Jon Spring pointed out in the comments, a good place to start is the documentation. Hopefully, I've pointed out the functions in which to look for inheritance patterns in ggplot2.
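For theme elements specifically, one possible starting point is the exported calc_element() function, which resolves an element against its parents in a given theme. The sketch below only assumes theme_grey() as the theme and axis.title.x as an arbitrary element to inspect; it is a way to poke at the inheritance, not a full description of it.

```r
library(ggplot2)

th <- theme_grey()

# The element as stored in the theme: fields not set here are
# left to be filled in from parent elements
raw <- th[["axis.title.x"]]
print(raw)

# The same element with all inherited properties filled in
resolved <- calc_element("axis.title.x", th)
print(resolved)
```

Comparing the two printouts shows which properties were set on the element itself and which came from further up the theme hierarchy.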