I'm trying to plot the CDF of a distribution using R and ggplot2. However, I'm having trouble plotting the CDF function after I transform the Y axis to obtain a straight line. This kind of plot is frequently used in Gumbel paper plots, but here I'll use the normal distribution as an example.
I generate the data and plot its empirical cumulative distribution function along with the theoretical one. They fit well. However, when I apply a Y axis transformation, they no longer match.
sim <- rnorm(100)                            # Simulate some data
sim <- sort(sim)                             # Sort it
cdf <- seq(0, 1, length.out = length(sim))   # Compute the data CDF
df <- data.frame(x = sim, y = cdf)           # Build the data.frame
library(scales)
library(ggplot2)
# Now plot!
gg <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  stat_function(fun = pnorm, colour = "red")
gg
In the output, the points follow the red curve closely. Good!
Now I try to transform the Y axis according to the distribution used.
#Apply transformation
gg + scale_y_continuous(trans=probability_trans("norm"))
The points are transformed correctly (they lie on a straight line), but the function is not!
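For reference, here is a minimal sketch of what the transformation is expected to do. It assumes only that probability_trans("norm") applies qnorm as its forward transform; the grid x is made up for illustration. Passing the values of pnorm through the transform should give back x, which is why the theoretical CDF ought to appear as a straight line on the transformed axis:

```r
library(scales)

# Hypothetical grid of x values, for illustration only
x <- seq(-2, 2, by = 0.5)
p <- pnorm(x)                      # theoretical CDF on the original scale

tr <- probability_trans("norm")    # the transformation used for the Y axis
y_trans <- tr$transform(p)         # position of each CDF value on the transformed axis

# On the transformed axis the CDF should coincide with the line y = x
all.equal(y_trans, x)
```

So the red curve in the transformed plot should coincide with the diagonal that the points follow.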
However, everything seems to work fine if I let ggplot compute the CDF itself, like this:
ggplot(data.frame(x = sim), aes(x = x)) +
  stat_ecdf(geom = "point") +
  stat_function(fun = "pnorm", colour = "red") +
  scale_y_continuous(trans = probability_trans("norm"))
Why is this happening? Why doesn't calculating the CDF manually work with scale transformations?
This works:
gg <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  stat_function(fun = "pnorm", colour = "red", inherit.aes = FALSE) +
  scale_y_continuous(trans = probability_trans("norm"))
gg
Possible explanation:
The documentation states:
inherit.aes
If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders.
My guess:
Because scale_y_continuous changes the aesthetics of the main plot, we need to turn off the default inherit.aes=TRUE. It seems that, with inherit.aes=TRUE, stat_function picks up its aesthetics from the first layer of the plot, so the scale transformation is not applied to the function unless the layer is explicitly told not to inherit them.
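An alternative that sidesteps the inheritance question is to evaluate the theoretical CDF yourself and draw it as ordinary data, so that it goes through the same scale transformation as the points. This is only a sketch: curve_df, the seed, and the number of grid points are illustrative choices, not part of the original code.

```r
library(ggplot2)
library(scales)

set.seed(1)                                   # arbitrary seed, for reproducibility only
sim <- sort(rnorm(100))                       # same construction as in the question
df <- data.frame(x = sim, y = seq(0, 1, length.out = length(sim)))

# Hypothetical helper data frame: the theoretical CDF evaluated on a grid
curve_df <- data.frame(x = seq(min(sim), max(sim), length.out = 200))
curve_df$y <- pnorm(curve_df$x)

gg <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_line(data = curve_df, colour = "red") +   # plain data, so the y scale applies to it
  scale_y_continuous(trans = probability_trans("norm"))
gg
```

Since the curve is supplied as data rather than computed by a stat, its y values are transformed in the same way as the points.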