I have been trying to work around this error for a bit, but can't seem to resolve it.
I have a combined barplot as seen in the picture and I want to compute the error bars for each time frame (6, 8, 12 weeks and terminal). I used the summarySE function to find the values for each time frame, and this is the code I am using to create the graph along with the errorbars:
limits <- aes (ymax= summarySE_data$mean + summarySE_data$se, ymin = summarySE_data$mean - summarySE_data$se)
p <- ggplot(df, aes(x = df$time_frame, y = df$Engraftment_Efficiency , group = time_frame)) + scale_fill_discrete(breaks = c("six weeks", "eight weeks", "twelve weeks", "terminal point"))
p + geom_bar(stat = "identity", position = position_dodge(0.9)) + geom_errorbar(limits, position = position_dodge(0.9), width = 0.25, group = "Sample." )+ labs(x = NULL , y= "Engraftment Efficiency %") + ggtitle("PBL Engraftment Efficiency")
I have seen other posts regarding a similar error, however they haven't worked for me. Any input on how to resolve this error is appreciated. Thank you!
Here is my main df:
time_frame Engraftment_Efficiency
1 six weeks 49.8
2 six weeks 47.3
3 six weeks 56.1
4 six weeks 36.7
5 six weeks 54.8
6 six weeks 48.0
7 eight weeks 64.7
8 eight weeks 52.0
9 eight weeks 68.1
10 eight weeks 47.2
11 eight weeks 59.1
12 eight weeks 65.5
13 twelve weeks 72.6
14 twelve weeks 55.0
15 twelve weeks 77.3
16 twelve weeks 61.4
17 twelve weeks 73.4
18 twelve weeks 72.6
19 terminal point 69.8
20 terminal point 43.2
and here is the summarySE_data:
time_frame N mean sd se ci
1 six weeks 6 48.78333 6.922259 2.826000 7.264465
2 eight weeks 6 59.43333 8.302690 3.389559 8.713139
3 twelve weeks 6 68.71667 8.572611 3.499754 8.996404
4 terminal point 11 71.48182 20.684817 6.236707 13.896249
You can't use data$ inside aes(). The data comes from the data = argument; inside aes() you should just have column names.
Here's my best guess at what will work. I can't test it without seeing your data. Your group = "Sample." is very strange: it is not inside aes() and it is a string, so I can't tell whether it's a column name. I just deleted it.
p <-
  ggplot(
    df,
    aes(x = time_frame,
        y = Engraftment_Efficiency,
        group = time_frame)
  ) +
  scale_fill_discrete(breaks = c("six weeks", "eight weeks", "twelve weeks", "terminal point")) +
  geom_bar(stat = "identity", position = position_dodge(0.9)) +
  geom_errorbar(
    data = summarySE_data,
    aes(ymax = mean + se,
        ymin = mean - se),
    position = position_dodge(0.9),
    width = 0.25
  ) +
  labs(x = NULL, y = "Engraftment Efficiency %", title = "PBL Engraftment Efficiency")
This is my best guess for what you want:
ggplot(
  summarySE_data,
  aes(x = time_frame,
      y = mean,
      group = time_frame)
) +
  geom_bar(stat = "identity", position = position_dodge(0.9)) +
  geom_errorbar(
    data = summarySE_data,
    aes(ymax = mean + se,
        ymin = mean - se),
    position = position_dodge(0.9),
    width = 0.25
  ) +
  labs(x = NULL, y = "Engraftment Efficiency %", title = "PBL Engraftment Efficiency")
Here we only use the summary data frame - the height of the bars is the mean, and the error bars show +/- the standard error. It doesn't seem like the plot needs the raw data frame at all.
If you do use the raw data in another layer, you may have to set inherit.aes = FALSE in the geom_errorbar layer (and specify its x aesthetic); otherwise it will look for the y column in the data, not find it, and complain.
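For illustration, here is a rough sketch of that last case: raw values drawn as points in one layer, with error bars from a summary data frame in another. It makes several assumptions that are not in the question. It uses only the first three time frames, because the terminal point rows are shown only partially. The summary_df name and the base-R aggregate() calls are a stand-in for summarySE(). The jittered points are just one way to show the raw data.

```r
library(ggplot2)

# Raw data: the first three time frames copied from the question's df.
# "terminal point" is left out because the question shows only part of it.
df <- data.frame(
  time_frame = factor(
    rep(c("six weeks", "eight weeks", "twelve weeks"), each = 6),
    levels = c("six weeks", "eight weeks", "twelve weeks")
  ),
  Engraftment_Efficiency = c(
    49.8, 47.3, 56.1, 36.7, 54.8, 48.0,
    64.7, 52.0, 68.1, 47.2, 59.1, 65.5,
    72.6, 55.0, 77.3, 61.4, 73.4, 72.6
  )
)

# Hypothetical base-R stand-in for summarySE(): mean and standard error
# (sd / sqrt(N)) per time frame. Both aggregate() calls return rows in
# factor-level order, so the columns line up.
summary_df <- aggregate(Engraftment_Efficiency ~ time_frame, data = df, FUN = mean)
names(summary_df)[2] <- "mean"
summary_df$se <- aggregate(
  Engraftment_Efficiency ~ time_frame,
  data = df,
  FUN = function(v) sd(v) / sqrt(length(v))
)[[2]]

p <- ggplot(df, aes(x = time_frame, y = Engraftment_Efficiency)) +
  # Raw observations, jittered horizontally only
  geom_point(position = position_jitter(width = 0.1, height = 0)) +
  # The summary layer must not inherit y = Engraftment_Efficiency,
  # because summary_df has no such column; x is supplied explicitly.
  geom_errorbar(
    data = summary_df,
    aes(x = time_frame, ymin = mean - se, ymax = mean + se),
    inherit.aes = FALSE,
    width = 0.25
  ) +
  labs(x = NULL, y = "Engraftment Efficiency %", title = "PBL Engraftment Efficiency")

print(p)
```

Removing inherit.aes = FALSE from this sketch should reproduce the kind of "object not found" complaint described above, since the error bar layer would then look for Engraftment_Efficiency in summary_df.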