I am trying to plot 35 individual time series (102 data points each) using ggplot and geom_line. I'd also like to overlay the grand mean of the individual data across time as a second geom_line that is either a different color or a different alpha.
Here is a sample from my data:
> dput(head(mdata, 10))
structure(list(Individual = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L), Signal = c(-0.132894911, -0.13, 0, 0, 0, 0.02, 0.01,
0.01, 0, 0.02), Time = c(0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7,
0.8, 0.9)), row.names = c(NA, 10L), class = "data.frame")
I've done this before with summarySE; however, it is no longer compatible with the current version of R. I've tried using two separate data frames (one with the individual data and one with the mean data) and overlaying them, but I think that because I've melted the individual data (from a 35x102 data frame to a long one with 3570 rows and 3 columns), I am getting an error that says:
"Aesthetics must be either length 1 or the same as the data (102): group".
Then I tried using stat_summary and fun.data, but I am still getting an error that says:
Error: geom_line requires the following missing aesthetics: y
ggplot(data=mdata,aes(x=Time, y=Signal, group=Individual, ymin=-1, ymax=3))+
geom_line()+
stat_summary(fun.data="mean", geom="line", color = "red")
Here is a dropbox link to the example data frame and graph I need as an output.
Any advice would be greatly appreciated! I've seen similar problems elsewhere, but I think the fact I am grouping my data within the aesthetic is causing me problems.
You can add a geom_line() layer that draws from a summary data frame.
# Let's create the summary using `dplyr`
library(dplyr)
library(ggplot2)
# Grand mean: average Signal across all individuals at each time point
avg_group <- mdata %>%
  select(Individual, Signal, Time) %>%
  group_by(Time) %>%
  summarise(avg_sig = mean(Signal))
# -------------------------------------------------------------------------
# avg_group has one row per time point (102 rows here), with columns Time and avg_sig
# -------------------------------------------------------------------------
# Then plot the graph using
ggplot(mdata, aes(x = Time, y = Signal, group = Individual, ymin = -1, ymax = 3)) +
  geom_line() +
  # inherit.aes = FALSE: the summary data has no Individual column to group by
  geom_line(data = avg_group, aes(x = Time, y = avg_sig), inherit.aes = FALSE, color = "red") + theme_bw()
# -------------------------------------------------------------------------
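To check that a summary data frame really holds the mean across individuals at each time point (which is what a grand-mean line needs), here is a minimal sketch on a small synthetic data set. The `toy` data frame and its values are made up for illustration and are not taken from the question's data.

```r
library(dplyr)

# Hypothetical toy data: 3 individuals measured at the same 4 time points
toy <- data.frame(
  Individual = rep(1:3, each = 4),
  Time       = rep(c(0, 0.1, 0.2, 0.3), times = 3),
  Signal     = c(0, 1, 2, 3,   # individual 1
                 2, 3, 4, 5,   # individual 2
                 4, 5, 6, 7)   # individual 3
)

# One row per time point: mean Signal across the individuals
toy_mean <- toy %>%
  group_by(Time) %>%
  summarise(avg_sig = mean(Signal))

nrow(toy_mean)    # 4, one row per time point
toy_mean$avg_sig  # 2 3 4 5
```

If the summary instead has one row per individual, the grouping variable is the wrong one for this purpose.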
If you prefer stat_summary(), you can add an explicit variable that is constant across the whole data frame and use it as the grouping aesthetic for that layer. You can do that as follows:
# > head(mdata, 2)
# Individual Signal Time
# 1 1 -0.1328949 0.0
# 2 1 -0.1300000 0.1
# ------------------------------------------------------------------------
mdata$grand <- 1
# > head(mdata, 2)
# Individual Signal Time grand
# 1 1 -0.1328949 0.0 1
# 2 1 -0.1300000 0.1 1
# ------------------------------------------------------------------------
# plot using grand as an explicit variable used to group the plot
# (fun.y was deprecated in ggplot2 3.3.0 in favor of fun; use fun.y on older versions)
ggplot(mdata, aes(x = Time, y = Signal, group = Individual, ymin = -1, ymax = 3)) +
  geom_line() +
  stat_summary(aes(group = grand), fun = "mean", geom = "line", color = "red") + theme_bw()
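As a variation on the same idea, a constant can be mapped to the group aesthetic directly in the summary layer, which avoids adding a column to the data. The sketch below uses a made-up `toy` data frame (not the question's data) and assumes ggplot2 3.3.0 or later, where the argument is called `fun`. It then uses layer_data() to inspect what the summary layer actually computed.

```r
library(ggplot2)

# Hypothetical toy data: 3 individuals measured at the same 4 time points
toy <- data.frame(
  Individual = rep(1:3, each = 4),
  Time       = rep(c(0, 0.1, 0.2, 0.3), times = 3),
  Signal     = c(0, 1, 2, 3, 2, 3, 4, 5, 4, 5, 6, 7)
)

p <- ggplot(toy, aes(x = Time, y = Signal, group = Individual)) +
  geom_line() +
  # group = 1 overrides the per-individual grouping for this layer only,
  # so the mean is taken over all individuals at each Time value
  stat_summary(aes(group = 1), fun = mean, geom = "line", color = "red")

# Data behind the second layer (the summary line)
mean_layer <- layer_data(p, 2)
nrow(mean_layer)  # 4, one summarised point per time point
```

Without the group override, stat_summary() inherits group = Individual and computes a "mean" of a single value per individual per time point, which just redraws the individual lines.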
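To make something like the output you expect (as shown in the link you shared), you can add a rectangle with geom_rect():
# se() is not in base R; assumed here to be the standard error of the mean
se <- function(x) sd(x) / sqrt(length(x))
rect_xmin <- mean(mdata$Time) + se(mdata$Time)
ggplot(data = mdata, aes(x = Time, y = Signal, group = Individual, ymin = -1, ymax = 3)) +
  geom_line() +
  geom_rect(xmin = rect_xmin, xmax = rect_xmin + 0.4, ymin = -1, ymax = -0.94, fill = "red") + theme_bw()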
This output produces a warning because not everything in the rectangle layer comes from the data, although the grand mean and standard error are used to position the rectangle.
You may refer here for the se function.