I am creating plots similar to the first example image below, and need plots like the second example below.
library(ggplot2)
library(scales)
# some data
data.2015 = data.frame(score = c(-50,20,15,-40,-10,60),
area = c("first","second","third","first","second","third"),
group = c("Findings","Findings","Findings","Benchmark","Benchmark","Benchmark"))
data.2014 = data.frame(score = c(-30,40,-15),
area = c("first","second","third"),
group = c("Findings","Findings","Findings"))
# breaks and limits
breaks.major = c(-60,-40,-22.5,-10, 0,10, 22.5, 40, 60)
breaks.minor = c(-50,-30,-15,-5,0, 5, 15,30,50)
limits =c(-70,70)
# plot 2015 data
ggplot(data.2015, aes(x = area, y = score, fill = group)) +
geom_bar(stat = "identity", position = position_dodge(width = 0.9)) +
coord_flip() +
scale_y_continuous(limit = limits, oob = squish, minor_breaks = breaks.minor, breaks = breaks.major)
The data.2014 has only values for the "Findings" group. I would like to show those 2014 Findings values on the plot, on the appropriate/corresponding data.2015$area, where there is 2014 data available.
To show last year's data just on the "Finding" (red bars) data, I'd like to use a one-sided errorbar/whisker that emanates from the value of the relevant data.2015 bar, and terminates at the data.2014 value, for example:
I thought to do this by using layers and plotting error bars so that the 2015 data could overlap, however this doesn't work when the 2014 result is abs() smaller than the 2015 result and is thus occluded.
Considerations:
So I've added to the below solution, I used that exact code but instead used the geom_linerange
so that it would add lines without the caps, then I also used the geom_errorbar
, but with ymin and ymax set to the same value, so that the result is a one-sided error bar in ggplot
geom_bar
! Thanks for the help.
I believe you can get most of what you want with a little data manipulation. Doing an outer join of the two datasets will let you add the error bars with the appropriate dodging.
alldat = merge(data.2015, data.2014, all = TRUE, by = c("area", "group"),
suffixes = c(".2015", ".2014"))
To make the error bar one-sided, you'll want ymin
to be either the same as y
or NA
depending on the group. It seemed easiest to make a new variable, which I called plotscore
, to achieve this.
alldat$plotscore = with(alldat, ifelse(is.na(score.2014), NA, score.2015))
The last thing I did is to make a variable direction
for when the 2015 score decreased vs increased compared to 2014. I included a third category for the Benchmark
group as filler because I ran into some issues with the dodging without it.
alldat$direction = with(alldat, ifelse(score.2015 < score.2014, "dec", "inc"))
alldat$direction[is.na(alldat$score.2014)] = "absent"
The dataset used for plotting would look like this:
area group score.2015 score.2014 plotscore direction
1 first Benchmark -40 NA NA absent
2 first Findings -50 -30 -50 dec
3 second Benchmark -10 NA NA absent
4 second Findings 20 40 20 dec
5 third Benchmark 60 NA NA absent
6 third Findings 15 -15 15 inc
The final code I used looked like this:
ggplot(alldat, aes(x = area, y = score.2015, fill = group)) +
geom_bar(stat = "identity", position = position_dodge(width = 0.9)) +
geom_errorbar(aes(ymin = plotscore, ymax = score.2014, color = direction),
position = position_dodge(width = .9), lwd = 1.5, show.legend = FALSE) +
coord_flip() +
scale_y_continuous(limit = limits, oob = squish, minor_breaks = breaks.minor, breaks = breaks.major) +
scale_color_manual(values = c(NA, "red", "green"))
I'm using the development version of ggplot2, ggplot2_1.0.1.9002, and show_guide
is now deprecated in favor of show.legend
, which I used in geom_errorbar
.
I obviously didn't change the line type of the error bars to dashed with a solid cap, nor did I remove the bottom whisker as I don't know an easy way to do either of these things.