I am trying to make a boxplot of quite a large database to illustrate the temperature span (weather variability) per day of the year:
boxplot(Datasubset$Temp~Datasubset$Day,las=2,data=Datasubset,main="Weather Variability",xlab=names(Datasubset)[1],ylab=names(Datasubset)[3])
The Datasubset data frame looks like this:
Day Hour Temp
1/1/2015 1 3
2/1/2015 2 4
[...] [...] [...]
31/12/2015 8760 2
However, my x axis shows too many values, so the labels overlap and become unreadable. Is it possible to control the frequency of the x axis labels, for example to show a label only every 10 or 20 days?
Also, my xlab and ylab arguments seem a bit artificial. Is it possible to refer to the names of the columns in the data frame in a more natural way?
These are probably simple things but I couldn't seem to find answers in ?boxplot.
Thank you in advance.
You can accomplish this by suppressing the x axis and then creating your own custom axis. For example,
# Create example data similar to what you described:
Datasubset <- data.frame(Day=as.Date(16436:16800, origin='1970-01-01'),
                         Temp=sample(1:10, 365, replace=TRUE))
# Make the boxplot, without x-axis ticks by specifying xaxt='n':
boxplot(Temp ~ Day, data=Datasubset, las=2, main="Weather Variability",
        xlab='Day', ylab='Temp', xaxt='n')
# Make a vector of values to draw ticks at:
ticks <- seq(from=1, to=365, by=90)
# And draw the axis:
axis(1, las=1, at=ticks, labels=Datasubset$Day[ticks])
This creates a plot with an x axis label every 90 days. To label every 10 or 20 days instead, change the by argument of seq().
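Note that the example data above has exactly one row per day, so indexing Datasubset$Day by box position happens to work. With several readings per day (as in the hourly data from the question), the box positions no longer match row numbers. A minimal sketch of one way to handle that, assuming Day is stored as a Date column (the hourly data frame and the days vector below are made up for illustration):

```r
# Hypothetical hourly data: 24 readings per day for 365 days
set.seed(1)
hourly <- data.frame(
  Day  = rep(as.Date("2015-01-01") + 0:364, each = 24),
  Temp = sample(1:10, 365 * 24, replace = TRUE)
)

# boxplot() draws one box per distinct Day, at positions 1, 2, ..., n,
# so collect the distinct days in the same (sorted) order
days <- sort(unique(hourly$Day))

# Draw the boxes without the default x axis
boxplot(Temp ~ Day, data = hourly, main = "Weather Variability",
        xlab = "Day", ylab = "Temp", xaxt = "n")

# Label every 20th box; index the distinct days, not the data frame rows
ticks <- seq(from = 1, to = length(days), by = 20)
axis(1, las = 2, at = ticks, labels = format(days[ticks], "%d/%m"))
```

If Day is stored as text such as "31/12/2015", converting it first with as.Date(Day, format = "%d/%m/%Y") should keep the boxes in calendar order rather than alphabetical order.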
For more information, see the help pages for par (specifically the xaxt option) and axis by executing help('par') or help('axis').
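As for the second part of the question (deriving xlab and ylab from the column names), one possible approach is to store the formula in a variable and pull the variable names out of it with all.vars(). This is only a sketch; the names f and vars are illustrative:

```r
# Example data as above
Datasubset <- data.frame(Day = as.Date(16436:16800, origin = "1970-01-01"),
                         Temp = sample(1:10, 365, replace = TRUE))

# Store the formula once, then extract the variable names from it
f <- Temp ~ Day
vars <- all.vars(f)   # response first, then the grouping variable

# Use the extracted names as axis labels instead of names(Datasubset)[i]
boxplot(f, data = Datasubset, main = "Weather Variability",
        xlab = vars[2], ylab = vars[1], xaxt = "n")
```

This avoids depending on column positions, so the labels stay correct if columns are added to or reordered in the data frame.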