r, ggplot2, boxplot, axis-labels

Using ggplot2, how can I label the x axis by only one of the two factors used to create a grouped boxplot?


My question is simple, I think, but Googling and fussing have gotten me nowhere. Hopefully, someone here can help. I have data that was collected from two different treatment groups over three years. I want to generate a boxplot for each combination of treatment and year (to see if the treatment effect varied by year). However, I can't figure out how to label the x axis the way I want to using ggplot2.

Here's some code that creates fake data similar to my own.

library(ggplot2)

fakeX = factor(rep(c(0,1), times=60))
fakeY = factor(rep(c(1,2,3), each=40))
fakeZ = round(runif(120, 0, 12), digits=1)
df1 = data.frame(fakeX, fakeY, fakeZ)
df1$inter = interaction(df1$fakeX, df1$fakeY)

ggplot(df1, aes(x=inter, y = fakeZ)) + 
stat_boxplot(geom='errorbar', lwd=1.75) + 
geom_boxplot(outlier.size=5, lwd = 1.75, fatten=1.25, 
fill = rep(c("dim gray", "ivory3"), times=3), aes(fill=fakeY)) +
scale_x_discrete(labels = c("2013", "", "2014", "", "2015", ""))

This code produces a set of boxplots like those I'm looking for:

[Image: the grouped boxplots produced by the code above]

So far so good. As you can see, the treatments are already distinguished by color (the figure caption will explain which is which), so I just need the axis labels to say which year each pair of boxplots is from. What I can't figure out is how to center each year label exactly between the two boxplots it belongs to, which I think would be the clearest layout. Right now, the best I've been able to do is put the year label under the first boxplot of each year. I've tried using the breaks argument inside scale_x_discrete, to no avail. I've also tried passing position=position_nudge() to scale_x_discrete, but that didn't work either. If I add spaces in front of my axis labels, I can fake what I'm looking for, but I'd have to use trial and error to get the number of spaces exactly right for each graph, and it seems like there has to be a better way. Any thoughts?


Solution

  • A more specific example than my comment above:

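    # Map year (fakeY) to x and treatment (fakeX) to fill. Dodging then places both
    # boxes around each year's tick, so each year label sits centered under its pair.
    ggplot(df1, aes(x = fakeY, y = fakeZ, fill = fakeX)) + 
        stat_boxplot(geom = "errorbar", position = "dodge", lwd = 1.75) +
        geom_boxplot(position = "dodge", lwd = 1.75, fatten = 1.25, show.legend = FALSE) + 
        scale_fill_manual(values = c("dim gray", "ivory3")) + 
        scale_x_discrete(labels = c("2013", "2014", "2015"))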
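
To see why this centers the labels, here is a minimal sketch that rebuilds the question's fake data and inspects where the dodged boxes end up. It assumes ggplot2 is installed. The seed and the names p, boxes, centers, and pair_midpoints are illustrative additions, not part of the original post, and the styling arguments (lwd, fatten, the error bars) are left out to keep the focus on positions.

```r
library(ggplot2)

# Rebuild the fake data from the question (the seed is an addition for reproducibility).
set.seed(1)
df1 <- data.frame(
  fakeX = factor(rep(c(0, 1), times = 60)),    # treatment
  fakeY = factor(rep(c(1, 2, 3), each = 40)),  # year
  fakeZ = round(runif(120, 0, 12), digits = 1)
)

# Year on x, treatment on fill: each year gets one tick with two dodged boxes.
p <- ggplot(df1, aes(x = fakeY, y = fakeZ, fill = fakeX)) +
  geom_boxplot(position = "dodge", show.legend = FALSE) +
  scale_fill_manual(values = c("dim gray", "ivory3")) +
  scale_x_discrete(labels = c("2013", "2014", "2015"))

# layer_data() returns the computed box geometry, including the dodged xmin/xmax.
boxes <- layer_data(p, 1)
centers <- (as.numeric(boxes$xmin) + as.numeric(boxes$xmax)) / 2

# Average the two box centers within each year. If the pairs are symmetric
# around their tick, these midpoints land on the tick positions 1, 2, 3.
pair_midpoints <- tapply(centers, round(centers), mean)
print(pair_midpoints)
```

Because the discrete scale places one tick per level of fakeY and the dodge spreads the two treatment boxes symmetrically around it, no manual label nudging or padding with spaces should be needed.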