Aesthetics must be either length 1 or the same as the data: ymax, ymin, x, y

I have been trying to work around this error for a bit, but can't seem to resolve it.

I have a combined barplot as seen in the picture and I want to compute the error bars for each time frame (6, 8, 12 weeks and terminal). I used the summarySE function to find the values for each time frame, and this is the code I am using to create the graph along with the errorbars:

limits <- aes (ymax= summarySE_data$mean + summarySE_data$se, ymin = summarySE_data$mean - summarySE_data$se)

p <- ggplot(df, aes(x = df$time_frame,  y = df$Engraftment_Efficiency , group = time_frame)) + scale_fill_discrete(breaks = c("six weeks", "eight weeks", "twelve weeks", "terminal point"))

p + geom_bar(stat = "identity", position = position_dodge(0.9)) + geom_errorbar(limits, position = position_dodge(0.9), width = 0.25, group = "Sample." )+ labs(x = NULL , y= "Engraftment Efficiency %") + ggtitle("PBL Engraftment Efficiency")

I have seen other posts regarding a similar error, however they haven't worked for me. Any input on how to resolve this error is appreciated. Thank you!

Here is my main df:

    time_frame Engraftment_Efficiency
1       six weeks                     49.8
2       six weeks                     47.3
3       six weeks                     56.1
4       six weeks                     36.7
5       six weeks                     54.8
6       six weeks                     48.0
7     eight weeks                     64.7
8     eight weeks                     52.0
9     eight weeks                     68.1
10    eight weeks                     47.2
11    eight weeks                     59.1
12    eight weeks                     65.5
13   twelve weeks                     72.6
14   twelve weeks                     55.0
15   twelve weeks                     77.3
16   twelve weeks                     61.4
17   twelve weeks                     73.4
18   twelve weeks                     72.6
19 terminal point                     69.8
20 terminal point                     43.2

and here is the summarySE_data:

      time_frame  N     mean        sd       se        ci
1      six weeks  6 48.78333  6.922259 2.826000  7.264465
2    eight weeks  6 59.43333  8.302690 3.389559  8.713139
3   twelve weeks  6 68.71667  8.572611 3.499754  8.996404
4 terminal point 11 71.48182 20.684817 6.236707 13.896249

Solution

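Don't use data$column inside aes(). Each layer takes its data from the data = argument, and inside aes() you should refer to columns by their bare names. The error appears because summarySE_data$mean has only 4 values (one per time frame), while the layer's data, df, has one row per observation, so the lengths don't match.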
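To make the length mismatch concrete, here is a minimal sketch in base R. The data frame is a toy stand-in built from a few of the posted values, and the name means is only for illustration:

```r
# Toy stand-in for the raw data: two time frames, three observations each
df <- data.frame(
  time_frame = rep(c("six weeks", "eight weeks"), each = 3),
  Engraftment_Efficiency = c(49.8, 47.3, 56.1, 64.7, 52.0, 68.1)
)

# One mean per time frame, analogous to summarySE_data$mean
means <- tapply(df$Engraftment_Efficiency, df$time_frame, mean)

nrow(df)       # 6: the length ggplot expects for every aesthetic in this layer
length(means)  # 2: the length actually supplied by a summary vector
```

A vector of length 2 is neither length 1 nor the same length as the 6-row data, which is the situation the error message describes.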

Here's my best guess at what will work; I can't test it without your data. Your group = "Sample." is very strange: it is not inside aes() and it is a string, so I can't tell whether it's meant to be a column name. I just deleted it.

p <- ggplot(
    df,
    aes(x = time_frame,
        y = Engraftment_Efficiency,
        group = time_frame)
  ) +
  scale_fill_discrete(breaks = c("six weeks", "eight weeks", "twelve weeks", "terminal point")) +
  geom_bar(stat = "identity", position = position_dodge(0.9)) +
  geom_errorbar(
    data = summarySE_data,
    # summarySE_data has no Engraftment_Efficiency column, so don't inherit
    # the plot-level y mapping; give this layer its own x instead
    aes(x = time_frame,
        ymax = mean + se,
        ymin = mean - se),
    inherit.aes = FALSE,
    position = position_dodge(0.9),
    width = 0.25
  ) +
  labs(x = NULL, y = "Engraftment Efficiency %", title = "PBL Engraftment Efficiency")

This is my best guess for what you want:

ggplot(
    summarySE_data,
    aes(x = time_frame,
        y = mean,
        group = time_frame)
  ) +
  geom_bar(stat = "identity", position = position_dodge(0.9)) +
  geom_errorbar(
    data = summarySE_data,
    aes(ymax = mean + se,
        ymin = mean - se),
    position = position_dodge(0.9),
    width = 0.25
  ) +
  labs(x = NULL, y = "Engraftment Efficiency %", title = "PBL Engraftment Efficiency")

Here we only use the summary data frame - the height of the bars is the mean, and the error bars show +/- the standard error. It doesn't seem like the plot needs the raw data frame at all.

If you do use the raw data in another layer, you may have to set inherit.aes = FALSE in the geom_errorbar layer (and specify its x aesthetic); otherwise it will inherit the y mapping, look for that column in the summary data, not find it, and complain.
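As a rough sketch of that last case, the example below draws the raw observations as points and adds error bars from a separate summary data frame. The data are a toy subset of the posted values, and summary_df is a hypothetical stand-in for summarySE_data, computed here with base R rather than summarySE:

```r
library(ggplot2)

# Toy stand-in for the raw data: two time frames, three observations each
df <- data.frame(
  time_frame = rep(c("six weeks", "eight weeks"), each = 3),
  Engraftment_Efficiency = c(49.8, 47.3, 56.1, 64.7, 52.0, 68.1)
)

# Per-group mean and standard error (hypothetical stand-in for summarySE_data)
summary_df <- do.call(rbind, lapply(split(df, df$time_frame), function(d) {
  data.frame(
    time_frame = d$time_frame[1],
    mean = mean(d$Engraftment_Efficiency),
    se = sd(d$Engraftment_Efficiency) / sqrt(nrow(d))
  )
}))

p <- ggplot(df, aes(x = time_frame, y = Engraftment_Efficiency)) +
  geom_point() +
  geom_errorbar(
    data = summary_df,
    # This layer supplies all of its own aesthetics, so the plot-level
    # y = Engraftment_Efficiency is never looked up in summary_df
    aes(x = time_frame, ymin = mean - se, ymax = mean + se),
    inherit.aes = FALSE,
    width = 0.25
  )
```

Removing inherit.aes = FALSE from this sketch should bring back an error, because summary_df has no Engraftment_Efficiency column for the inherited y mapping.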