Main issue: I want to display the data from 0 to 1.0 as an upward bar (starting from 0) but do not want the intervals to be equally spaced but log spaced.
I am trying to display the column labeled "mean" in the dataset below as a bar plot in ggplot. Because the numbers are very small, I would like to show the y-axis on a log scale rather than log-transform the data itself. In other words, I want upright bars with y-axis labels 0, 1e-8, 1e-6, 1e-4, 1e-2 and 1e-0 (i.e. from 0 to 1.0, but with log-scaled intervals).
The solution below does not work as the bars are inverted.
> print(df)
       type         mean           sd           se snp
V7    outer 1.596946e-07 2.967432e-06 1.009740e-08   A
V8    outer 7.472417e-07 6.598652e-06 2.245349e-08   B
V9    outer 1.352327e-07 2.515771e-06 8.560512e-09   C
V10   outer 2.307726e-07 3.235821e-06 1.101065e-08   D
V11   outer 4.598375e-06 1.653457e-05 5.626284e-08   E
V12   outer 5.963164e-07 5.372226e-06 1.828028e-08   F
V71  middle 2.035414e-07 3.246161e-06 1.104584e-08   A
V81  middle 9.000131e-07 7.261463e-06 2.470886e-08   B
V91  middle 1.647716e-07 2.875840e-06 9.785733e-09   C
V101 middle 3.290817e-07 3.886779e-06 1.322569e-08   D
V111 middle 6.371170e-06 1.986268e-05 6.758752e-08   E
V121 middle 8.312429e-07 6.329386e-06 2.153725e-08   F
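For anyone who wants to reproduce the plots, here is a minimal sketch that rebuilds the data frame from the printout above using base R only. The column name `id` and the choice of `stringsAsFactors = TRUE` are my assumptions (the printout does not show column classes); factors are assumed so that calls such as `as.numeric(snp)` behave as in the code further down.

```r
# Rebuild df from the printed values. "id" is a hypothetical name for the
# row-name column; it is consumed by row.names = 1 and does not appear in df.
df <- read.table(header = TRUE, row.names = 1, stringsAsFactors = TRUE, text = "
id   type   mean         sd           se           snp
V7   outer  1.596946e-07 2.967432e-06 1.009740e-08 A
V8   outer  7.472417e-07 6.598652e-06 2.245349e-08 B
V9   outer  1.352327e-07 2.515771e-06 8.560512e-09 C
V10  outer  2.307726e-07 3.235821e-06 1.101065e-08 D
V11  outer  4.598375e-06 1.653457e-05 5.626284e-08 E
V12  outer  5.963164e-07 5.372226e-06 1.828028e-08 F
V71  middle 2.035414e-07 3.246161e-06 1.104584e-08 A
V81  middle 9.000131e-07 7.261463e-06 2.470886e-08 B
V91  middle 1.647716e-07 2.875840e-06 9.785733e-09 C
V101 middle 3.290817e-07 3.886779e-06 1.322569e-08 D
V111 middle 6.371170e-06 1.986268e-05 6.758752e-08 E
V121 middle 8.312429e-07 6.329386e-06 2.153725e-08 F
")

# Every lower error-bar end (mean - se) is positive, so it can sit on a log axis.
all(df$mean - df$se > 0)
```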
The code below properly generates the grouped barplot with error bars:
ggplot(data=df, aes(x=snp,y=mean,fill=type))+
geom_bar(stat="identity",position=position_dodge(),width=0.5) +
geom_errorbar(aes(ymin=mean-se, ymax=mean+se),width=.3, position=position_dodge(.45))
However, I want to make the y-axis log scaled and so I add in scale_y_log10() as follows:
ggplot(data=df, aes(x=snp,y=mean,fill=type))+
geom_bar(stat="identity",position=position_dodge(),width=0.5) + scale_y_log10() +
geom_errorbar(aes(ymin=mean-se, ymax=mean+se),width=.3, position=position_dodge(.45))
But strangely, the bars now hang down from the top of the plot. I simply want them to go up (as normal) and don't know what I am doing wrong.
Thank you
Here's a bit of hacking to show what happens if you try to get bars that start at zero on a log scale. I've used geom_segment for illustration, so that I can create "bars" (wide line segments, actually) extending over arbitrary ranges. To make this work, I've also had to do all the dodging manually, which is why the x mapping looks weird.
In the example below, the scale goes from y=1e-20 to y=1. The y-axis intervals are log scaled, meaning that the physical distance from, say 1e-20 to 1e-19 is the same as the physical distance from, say, 1e-8 to 1e-7, even though the magnitudes of those intervals differ by a factor of one trillion.
Bars that go down to zero can't be displayed, because zero on the log scale is an infinite distance below the bottom of the graph. We could get closer to zero by, for example, changing 1e-20 to 1e-100 in the code below. But that will just make the already-small physical distances between the data values even smaller and thus even harder to distinguish.
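The arithmetic behind this can be checked in a few lines of base R. This is only a sketch of the reasoning, not part of the plotting code; the variable names are mine, and the three means are copied from the question's data. As far as I understand ggplot2's behaviour, a bar is anchored at 0 in the transformed (log10) space, which corresponds to 10^0 = 1 on the axis, and that would be consistent with the bars in the question hanging down from 1.

```r
# One decade spans the same distance on a log10 axis wherever it sits.
d_small <- log10(1e-19) - log10(1e-20)
d_large <- log10(1e-7)  - log10(1e-8)
isTRUE(all.equal(d_small, d_large))

# The raw widths of those two intervals differ by about a factor of 1e12.
ratio <- (1e-7 - 1e-8) / (1e-19 - 1e-20)

# Zero has no finite position on the log axis.
zero_pos <- log10(0)   # -Inf

# The means are all below 1, so their log10 values are negative, i.e. they
# lie below the position of 1 (log10(1) == 0) on the transformed axis.
means <- c(1.596946e-07, 4.598375e-06, 6.371170e-06)
log_means <- log10(means)
all(log_means < 0)
```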
The bars are also misleading in another way, because, as @hrbrmstr pointed out, our brains treat distance along the bar linearly, but the magnitude represented by each increment of distance along the bar changes by a factor of 10 about every few millimeters in the example below. The bars simply aren't encoding meaningful information about the data.
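# Manual dodging: shift each type left/right of the integer snp position.
# as.numeric() here assumes snp and type are factors.
ggplot(data=df, aes(x=as.numeric(snp) + 0.3*(as.numeric(type) - 1.5),
                    y=mean, colour=type)) +
  geom_errorbar(aes(ymin=mean-se, ymax=mean+se), width=.3) +
  # Thick segments stand in for bars, running from 1e-20 up to the mean
  geom_segment(aes(xend=as.numeric(snp) + 0.3*(as.numeric(type) - 1.5),
                   y=1e-20, yend=mean), size=5) +
  scale_y_log10(limits=c(1e-20, 1), breaks=10^(-100:0), expand=c(0,0)) +
  scale_x_continuous(breaks=1:6, labels=LETTERS[1:6])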
If you want to stick with a log scale, maybe plotting points would be a better approach:
pd = position_dodge(.5)
ggplot(data=df, aes(x=snp,y=mean,fill=type))+
geom_errorbar(aes(ymin=mean-se, ymax=mean+se, colour=type), width=.3, position=pd) +
geom_point(aes(colour=type), position=pd) +
scale_y_log10(limits=c(1e-7, 1e-5), breaks=10^(-10:0)) +
annotation_logticks(sides="l")