I have a function which calculates the median and 90% CI using a moving window. So for each x = seq(xmin, xmax, by = wStep)
, I return the median and 5% and 95% quantiles for all y
whose x
values are less that wSize/2
away. I want to display this as a line and ribbon using ggplot2 by creating a custom smoothing function, stat_movingwindow()
. I can create the result I want using geom_smooth(data = ..., stat = "identity")
:
moveWin <- function(d, wSize = 0.5, wStep = 0.1,
f = function(x) quantile(x, prob = c(0.05,0.50,0.95), na.rm = TRUE)
){
x <- seq(min(d$x), max(d$x), by = wStep)
y <- matrix(NA, ncol = 3, nrow = length(x))
for(i in seq_along(x)){
y[i, ] <- f(d[abs(d$x - x[i]) < wSize/2, ]$y)
}
y <- as.tibble(y)
colnames(y) <- c("ymin","y","ymax")
y$x <- x
return(as.tibble(y))
}
set.seed(123)
d <- tibble(
x= sqrt(seq(0,1,length.out = 50))*10,
y= rnorm(50)
)
ggplot(data = d) + aes(x = x, y = y) +
geom_smooth(
data = function(d) moveWin(d, wSize = 1, wStep = 0.1),
mapping = aes(ymin = ymin, ymax= ymax),
stat = "identity") +
geom_point() + scale_x_continuous(breaks = 1:10)
Following the Vignette Extending ggplot2, this is the code I've come up with so far. However, the problem is that this does not show the ribbon. Maybe I need some way to declare that this custom stat is providing the aesthetics ymin and ymax. How do I get the following code to output the similar result as above?
StatMovingWindow <- ggproto("StatMovingWindow", Stat,
compute_group = function(data, scales, wSize, wStep, fun) {
moveWin(data, wSize = wSize, wStep = wStep, f = fun)
},
required_aes = c("x", "y")
)
stat_movingwindow <- function(mapping = NULL, data = NULL,
fun = function(d) quantile(d, probs = c(0.05, 0.50, 0.95), na.rm = TRUE),
wStep = 0.1, wSize = 1,
geom = "smooth", position = "identity", show.legend = NA, inherit.aes = TRUE,
...
){
layer(
stat = StatMovingWindow, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(wStep = wStep, wSize = wSize, fun = fun, ...)
)
}
ggplot(data = d) + aes(x = x, y = y) +
stat_movingwindow(wStep = 0.1, wSize = 1) +
geom_point() + scale_x_continuous(breaks = 1:10)
In your code for stat_movingwindow
, the line for the corresponding geom is geom = "smooth"
:
stat_movingwindow <- function(mapping = NULL, data = NULL,
fun = function(d) quantile(d, probs = c(0.05, 0.50, 0.95), na.rm = TRUE),
wStep = 0.1, wSize = 1,
geom = "smooth", # <- look here
position = "identity", show.legend = NA, inherit.aes = TRUE,
...
){
layer(
stat = StatMovingWindow, data = data, mapping = mapping, geom = geom,
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = list(wStep = wStep, wSize = wSize, fun = fun, ...)
)
}
Checking the code for geom_smooth
, we see that it includes the parameter se = TRUE
, and uses GeomSmooth
as its geom:
> geom_smooth
function (mapping = NULL, data = NULL, stat = "smooth", position = "identity",
..., method = "auto", formula = y ~ x, se = TRUE, # <- look here
na.rm = FALSE,
show.legend = NA, inherit.aes = TRUE)
{
params <- list(na.rm = na.rm, se = se, ...)
if (identical(stat, "smooth")) {
params$method <- method
params$formula <- formula
}
layer(data = data, mapping = mapping, stat = stat, geom = GeomSmooth, # <- and here
position = position, show.legend = show.legend, inherit.aes = inherit.aes,
params = params)
}
Digging deeper into GeomSmooth, we see that its draw_group
function (which is responsible for plotting the smoothed line) has se = FALSE
as its default parameter.
From the code, if se == FALSE
, has_ribbon
would be FALSE
as well even though both ymax
& ymin
exists in your data you thanks to the StatMovingWindow$compute_group
function. And this in turn means the only the result of GeomLine$draw_panel(path, panel_params, coord)
would be returned alone, without GeomRibbon$draw_group(ribbon, panel_params, coord)
.
> GeomSmooth$draw_group
<ggproto method>
<Wrapper function>
function (...)
f(...)
<Inner function (f)>
function (data, panel_params, coord, se = FALSE) # <- look here
{
ribbon <- transform(data, colour = NA)
path <- transform(data, alpha = NA)
has_ribbon <- se && !is.null(data$ymax) && !is.null(data$ymin) # <- and here
gList(if (has_ribbon) GeomRibbon$draw_group(ribbon, panel_params, coord),
GeomLine$draw_panel(path, panel_params, coord))
}
In short, geom_smooth
's default parameter of se = TRUE
overrides the default behaviour in GeomSmooth$draw_group
, (the same holds for stat_smooth
too) and we should do the same in stat_movingwindow
if we want to achieve the same result.
If you expect you'll usually want the ribbon to be plotted, you can include se = TRUE
as a parameter in your definition for stat_movingwindow
. If it's going to be on an ad hoc basis, you can include it whenever necessary in your code instead.