Consider the following data:
set.seed(4235)
dates <- c("2016-01-01", "2015-01-01", "2014-01-01", "2013-01-01")
small <- data.frame(group = "small", n1 = rnorm(4), dates = as.Date(dates))
medium <- data.frame(group = "medium", n1 = rnorm(4), dates = as.Date(dates))
large <- data.frame(group = "large", n1 = rnorm(4), dates = as.Date(dates))
data <- rbind(small, medium, large)
This is pretty basic data that can be plotted like this:
library(ggplot2)
ggplot(data, aes(dates, col = group)) +
  geom_line(aes(y = n1))
However, suppose I want to plot the small and medium groups relative to the large group, that is, the difference between each of them and the large group. The large group would then be a flat line at zero, and the other two lines would show their differences from it, something like an autocovariance plot.
Any idea on how to do this with ggplot?
You probably can't do it with ggplot directly, though it is relatively straightforward to calculate the differences first, then pass them back to ggplot.
Here, I am using tidyr and dplyr to do the manipulations. First, I spread the data to get the groups in their own columns (with one row per date) to allow comparison. Then, I mutate to create the difference variables of interest. Finally, I gather the comparisons back into long form (be warned that this duplicates the entries in small, medium, and large; however, those can be dropped with select if needed). Then, simply pass the result to ggplot and plot however you desire (here, simple lines again).
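library(dplyr)
library(tidyr)
library(ggplot2)

data %>%
  # One row per date, one column per group
  spread(group, n1) %>%
  # Unnamed expressions become columns named `large - medium` and `large - small`
  mutate(large - medium, large - small) %>%
  # Back to long form: one row per date and comparison
  gather(Comparison, Difference, `large - medium`, `large - small`) %>%
  ggplot(aes(x = dates, y = Difference, col = Comparison)) +
  geom_line()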
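This gives a line plot of Difference against dates, with one line per Comparison (large - medium and large - small).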
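If you want the large group itself drawn as a flat line at zero, as the question describes, one possible variation is to skip the reshaping and subtract the large value within each date. This is only a sketch under a couple of assumptions: every date has exactly one row for the large group, and the sign is flipped relative to the code above (each group minus large, rather than large minus each group). The names diffs and diff_from_large are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Same example data as in the question
set.seed(4235)
dates <- as.Date(c("2016-01-01", "2015-01-01", "2014-01-01", "2013-01-01"))
data <- rbind(
  data.frame(group = "small",  n1 = rnorm(4), dates = dates),
  data.frame(group = "medium", n1 = rnorm(4), dates = dates),
  data.frame(group = "large",  n1 = rnorm(4), dates = dates)
)

# Within each date, subtract the "large" value from every group's value.
# Assumes exactly one "large" row per date, so "large" itself becomes 0.
diffs <- data %>%
  group_by(dates) %>%
  mutate(diff_from_large = n1 - n1[group == "large"]) %>%
  ungroup()

# One line per group; the "large" line sits at zero
p <- ggplot(diffs, aes(x = dates, y = diff_from_large, col = group)) +
  geom_line()
print(p)
```

To keep only the two comparison lines, you could filter out the large rows before plotting, for example with filter(group != "large").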