I have a data set with observations of two variables, incidence (0-100) and severity (0-5), across 5 years. It looks something like this:
   cbb.incidence avg.severity Year
1      86.666667    2.0333333 2009
2      83.333333    1.8666667 2009
3      20.000000    1.2000000 2009
4      26.666667    1.2666667 2010
5      86.666667    1.9000000 2010
6      86.666667    1.8666667 2010
7      86.666667    2.0333333 2011
8      83.333333    1.8666667 2011
9      20.000000    1.2000000 2012
10     26.666667    1.2666667 2012
11     86.666667    1.9000000 2013
12     86.666667    1.8666667 2013
What I want is a figure with two box plots for each year, one for each variable. I found exactly the same question here on Stack Overflow: Plot multiple boxplot in one graph
So I "melt" the data as described in that example, and then plot it the same way:
meltedData <- melt(incidence_all, id.var = 'Year')
ggplot(data = meltedData, aes(x = Year, y = value)) +
  geom_boxplot(aes(fill = variable))
The data appears to be in the correct format. The melted data looks like this (this is a subset, there are >2000 rows):
     Year      variable     value
1017 2009  avg.severity 1.5333333
1018 2009  avg.severity 2.1333333
1019 2009  avg.severity 2.0666667
1020 2009  avg.severity 2.0000000
1021 2009  avg.severity 2.0666667
1022 2009  avg.severity 1.6333333
1023 2009  avg.severity 1.5666667
1024 2009 cbb.incidence 16.777775
1025 2009 cbb.incidence 35.888865
Will one of you R wizards please tell me what I'm doing wrong?
ALSO, I already know that my two variables are on very different scales (incidence is from 0-100, and severity is 1-5), so if I simply plot both on the same y-axis scale the smaller values will be unreadable. I would like to have a double y-axis, one on the left and one on the right, with each variable scaled to its own axis. I have not seen a box plot example with this feature. Can someone recommend how to approach this, preferably in ggplot?
THANK YOU!!
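Try converting Year to a factor first. With a numeric Year, ggplot2 treats the x-axis as continuous and does not split the boxes by year: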
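library(reshape2)  # provides melt()
library(ggplot2)
# A factor Year makes the x-axis discrete, so each year gets its own boxes
incidence_all$Year <- factor(incidence_all$Year)
meltedData <- melt(incidence_all, id.vars = "Year")
ggplot(data = meltedData, aes(x = Year, y = value)) +
  geom_boxplot(aes(fill = variable))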
You will get something like this:
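For the second question, one alternative would be to rescale avg.severity so that its maximum becomes 100, which puts both variables on a comparable y-axis range: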
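# Rescaled copy of severity (max becomes 100), stored in a new column avg.severitys
incidence_all$avg.severitys <- incidence_all$avg.severity * 100 / max(incidence_all$avg.severity)
# Drop column 2 (the original avg.severity) before melting
meltedData <- melt(incidence_all[, -2], id.vars = "Year")
ggplot(data = meltedData, aes(x = Year, y = value)) +
  geom_boxplot(aes(fill = variable))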
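If you also want the right-hand axis to read in the original severity units, a possible sketch is to combine the same rescaling with sec_axis(), which I believe requires ggplot2 2.2.0 or later. This is not tested against your full data: it rebuilds only the 12 rows shown in the question, and scale_factor and p are names I made up for the illustration.

```r
library(ggplot2)
library(reshape2)

# The 12 sample rows from the question (the real data has many more rows)
incidence_all <- data.frame(
  cbb.incidence = c(86.666667, 83.333333, 20, 26.666667, 86.666667, 86.666667,
                    86.666667, 83.333333, 20, 26.666667, 86.666667, 86.666667),
  avg.severity  = c(2.0333333, 1.8666667, 1.2, 1.2666667, 1.9, 1.8666667,
                    2.0333333, 1.8666667, 1.2, 1.2666667, 1.9, 1.8666667),
  Year          = factor(c(2009, 2009, 2009, 2010, 2010, 2010,
                           2011, 2011, 2012, 2012, 2013, 2013))
)

# Hypothetical helper value: maps severity onto the 0-100 incidence scale
scale_factor <- 100 / max(incidence_all$avg.severity)
incidence_all$avg.severity <- incidence_all$avg.severity * scale_factor

meltedData <- melt(incidence_all, id.vars = "Year")

p <- ggplot(meltedData, aes(x = Year, y = value)) +
  geom_boxplot(aes(fill = variable)) +
  scale_y_continuous(
    name = "cbb.incidence",
    # The right-hand axis divides by scale_factor, so its labels are in severity units
    sec.axis = sec_axis(~ . / scale_factor, name = "avg.severity")
  )
print(p)
```

Note that the secondary axis is only a relabelling of the primary one: the severity boxes are still drawn from the rescaled values, and the right-hand labels simply undo that transformation.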