I have 30 plant species for which I have displayed the distributions of midday leaf water potential (lwp_md
) using boxplots and the package ggplot2
. But how do I group these species along the x-axis according to their leaf habits (e.g. Deciduous
, Evergreen
) as well as display a reference line indicating the mean lwp_md
value for each leaf habit level?
I have attempted with the package forcats
but really have no idea how to proceed with this one. I can't find anything after an extensive search online. The best I seem able to do is order species by some other function e.g. the median.
Below is an example of my code so far. Note I have used the packages ggplot2
and ggthemes
:
library(ggplot2)
ggplot(zz, aes(x=fct_reorder(species, lwp_md, fun=median, .desc=T), y=lwp_md)) +
geom_boxplot(aes(fill=leaf_habit)) +
theme_few(base_size=14) +
theme(legend.position="top",
axis.text.x=element_text(size=8, angle=45, vjust=1, hjust =1)) +
xlab("Species") +
ylab("Maximum leaf water potential (MPa)") +
scale_y_reverse() +
scale_fill_discrete(name="Leaf habit",
breaks=c("DEC", "EG"),
labels=c("Deciduous", "Evergreen"))
Here's a subset of my data including 4 of my species (2 deciduous, 2 evergreen):
> dput(zz)
structure(list(id = 1:20, species = structure(c(1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L
), .Label = c("AMYELE", "BURSIM", "CASXYL", "COLARB"), class = "factor"),
leaf_habit = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L), .Label = c("DEC",
"EG"), class = "factor"), lwp_md = c(-2.1, -2.5, -2.35, -2.6,
-2.45, -1.7, -1.55, -1.4, -1.55, -0.6, -2.6, -3.6, -2.9,
-3.1, -3.3, -2, -1.8, -2, -4.9, -5.35)), class = "data.frame", row.names = c(NA,
-20L))
An example of how I'm looking to display my data, cut and edited - I would like species
on x-axis, lwp_md
on y-axis:
gpplot
defaults to ordering your factors alphabetically. To avoid this you have to supply them as ordered factors. This can be done by arranging the data.frame
and then redeclaring the factors. To generate the mean value we can use group_by
and mutate a new mean column in the df, that can later be plotted.
Here is the complete code:
library(ggplot)
library(ggthemes)
library(dplyr)
zz2 <- zz %>% arrange(leaf_habit) %>% group_by(leaf_habit) %>% mutate(mean=mean(lwp_md))
zz2$species <- factor(zz2$species,levels=unique(zz2$species))
ggplot(zz2, aes(x=species, y=lwp_md)) +
geom_boxplot(aes(fill=leaf_habit)) +
theme_few(base_size=14) +
theme(legend.position="top",
axis.text.x=element_text(size=8, angle=45, vjust=1, hjust =1)) +
xlab("Species") +
ylab("Maximum leaf water potential (MPa)") +
scale_y_reverse() +
scale_fill_discrete(name="Leaf habit",
breaks=c("DEC", "EG"),
labels=c("Deciduous", "Evergreen")) +
geom_errorbar(aes(species, ymax = mean, ymin = mean),
size=0.5, linetype = "longdash", inherit.aes = F, width = 1)