I am trying to plot some data with shaded rectangles in the background.
The data frame df looks like this:
df <- data.frame(time = seq(0.1, 2, 0.1),
speed = c(seq(0.5, 5, 0.5), seq(5, 0.5, -0.5)),
type = c("a", "a", "b", "b", "b", "b", "c", "c", "c", "b", "b", "b", "b", "b", "c", "a", "b", "c", "b", "b"))
For the rectangles in the plot I am defining an object called dfRect with the variables xmin and xmax.
dfRect <- data.frame(xmin = c(0.3, 1.0, 1.9), xmax = c(0.7, 1.5, 2.0))
The problem is that I have to find xmin and xmax manually for the start and the end of each rectangle. A rectangle starts (xmin) at the start of a consecutive run of b in column type and ends (xmax) at the end of that same run. Single bs can be ignored.
Here is the plot, so you get an idea of what I am trying to accomplish:
ggplot() +
geom_rect(data = dfRect,
aes(xmin = xmin, xmax = xmax, ymin = -Inf, ymax = Inf),
fill = "yellow", alpha = 0.4) +
geom_line(data = df, aes(x = time, y = speed, color = type, group = 1), size = 3)
So in the end the question is: how can I automate the process of finding xmin and xmax and create dfRect automatically, so I don't have to define it myself?
Here is an approach using run-length encoding.
library(ggplot2)
df <- data.frame(time = seq(0.1, 2, 0.1),
speed = c(seq(0.5, 5, 0.5), seq(5, 0.5, -0.5)),
type = c("a", "a", "b", "b", "b", "b", "c", "c", "c", "b", "b", "b", "b", "b", "c", "a", "b", "c", "b", "b"))
# Convert to runlength encoding
rle <- rle(df$type == "b")
# Ignoring the single "b"s
rle$values[rle$lengths == 1 & rle$values] <- FALSE
# Determine starts and ends
ends <- cumsum(rle$lengths)
starts <- ends - rle$lengths + 1
# Build a data.frame from the rle
dfrect <- data.frame(
xmin = df$time[starts],
# Add 1 to the ends, because each line segment extends to the next data point.
# Cap at the last index so that we never index out of bounds.
xmax = df$time[pmin(ends + 1, max(ends))],
fill = rle$values
)
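To make the intermediate steps easier to follow, here is a small sketch that only inspects the run-length encoding for the example data. The variable name runs is chosen for illustration (it avoids reusing the name rle), and the expected values in the comments were worked out by hand for this particular type column.

```r
# Same `type` column as in df above
type <- c("a", "a", "b", "b", "b", "b", "c", "c", "c", "b",
          "b", "b", "b", "b", "c", "a", "b", "c", "b", "b")

# One entry per run of consecutive equal values of (type == "b")
runs <- rle(type == "b")
runs$lengths  # 2 4 3 5 2 1 1 2
runs$values   # FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE

# Last and first row index of every run
ends   <- cumsum(runs$lengths)     # 2 6 9 14 16 17 18 20
starts <- ends - runs$lengths + 1  # 1 3 7 10 15 17 18 19
```

The sixth run is the single b at row 17, which is why its value gets set to FALSE before building the data frame.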
This plot gives an idea of what the code above does:
ggplot() +
geom_rect(data = dfrect,
aes(xmin = xmin, xmax = xmax, ymin = -Inf, ymax = Inf, fill = fill),
alpha = 0.4) +
geom_line(data = df, aes(x = time, y = speed, color = type, group = 1), size = 3)
To get what you want, you need to filter out the FALSE rows:
ggplot() +
geom_rect(data = dfrect[dfrect$fill,],
aes(xmin = xmin, xmax = xmax, ymin = -Inf, ymax = Inf),
alpha = 0.4, fill = "yellow") +
geom_line(data = df, aes(x = time, y = speed, color = type, group = 1), size = 3)
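If you need this more than once, the same steps could be wrapped in a small function. The following is only a sketch under a few assumptions: the name run_rects and its min_length argument are made up for illustration, and the flag vector is assumed to contain no missing values.

```r
# Hypothetical helper (not part of any package): returns one row per run
# of TRUE in `flag` that is at least `min_length` observations long.
run_rects <- function(x, flag, min_length = 2) {
  runs   <- rle(flag)
  ends   <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  keep   <- runs$values & runs$lengths >= min_length
  data.frame(
    xmin = x[starts[keep]],
    # Extend to the next data point, capped at the last observation
    xmax = x[pmin(ends[keep] + 1, length(x))]
  )
}

df <- data.frame(time = seq(0.1, 2, 0.1),
                 speed = c(seq(0.5, 5, 0.5), seq(5, 0.5, -0.5)),
                 type = c("a", "a", "b", "b", "b", "b", "c", "c", "c", "b",
                          "b", "b", "b", "b", "c", "a", "b", "c", "b", "b"))

dfRect <- run_rects(df$time, df$type == "b")
```

For the example data this should give three rows that match the manually defined dfRect from the question (up to floating point rounding), so it can be passed to geom_rect() without further filtering.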
If you are looking for a stat that can calculate this for you, have a look here. Disclaimer: I wrote that function, which does a similar thing to the code I posted above.