Tags: r, ggplot2, range, limit

Let data reach limits in ggplot instead of going NA


I'm attempting to plot some standard error (SE) bars using ggplot2. In this set-up, thick bars display the typical SEs, and on top of them I overlay thin bars showing "alternative" SEs (the "se2" column in the data). The alternative SE bars are always wider than the typical ones.

The issue I'm running into is that the large alternative SE bars get removed, with a warning that 2 rows were removed because they contained missing values. What I would like is for these bars to be displayed anyway: if an alternative SE bar reaches the limit I've set, it should stop there but still be drawn (implying to the reader that it continues past the limit).

I've posted a simplified version of what I'm working with:

# Load packages
library(dplyr)
library(ggplot2)
library(ggpubr)

# Make dataframe for group 1

df_values1 <- data.frame(
  beta = c(0.07,0.04,0.3),
  se = c(.01,0.01,0.008),
  se2 = c(0.1,0.05,0.2),
  outcome = c("Name 1",
          "Name 2",
          "Name 3"),
  sample = c(rep("Group1",3))
)

# Make dataframe for group 2

df_values2 <- data.frame(
  beta = c(0.15,-0.04,0.03),
  se = c(.01,0.01,0.008),
  se2 = c(0.1,.2,0.05),
  outcome = c("Name 1",
          "Name 2",
          "Name 3"),
  sample = c(rep("Group2",3))
)

# Make dataframe for group 3

df_values3 <- data.frame(
  beta = c(0.22,0.18,-0.03),
  se = c(.01,0.01,0.008),
  se2 = c(1,0.05,0.01),
  outcome = c("Name 1",
          "Name 2",
          "Name 3"),
  sample = c(rep("Group3",3))
)

# Position dodge
pd <- position_dodge(0.7)

# Merge datasets
df_all <- rbind(df_values1, df_values2, df_values3)

# NOTE: use the levels of outcome from one of the non-merged datasets
df_all$outcome <- factor(df_all$outcome, levels = df_values1$outcome)

# Because the coordinates will be flipped, the order of the levels is 'reversed' here
df_all$sample <- factor(df_all$sample, levels = c('Group3', 'Group2', 'Group1'))


# Plot
picture <- ggplot(df_all, aes(x = outcome, y = beta, group = sample, colour = sample)) + 
  geom_hline(yintercept = c(-0.375, -0.125, 0.125, 0.375), size = 0.25, colour = 'grey95') +
  geom_errorbar(aes(ymin = beta-1.96*se, ymax = beta+1.96*se), width = 0, alpha = 1, size = 2, position = pd) +
  geom_errorbar(aes(ymin = beta-1.96*se2, ymax = beta+1.96*se2), width = 0, alpha = 1, size = 0.5, position = pd) +
  geom_hline(yintercept = 0, size = 0.25) + 
  guides(colour = guide_legend(reverse = TRUE), shape = guide_legend(reverse = TRUE)) + 
  ylim(-0.5,0.5) + 
  coord_flip() + 
  scale_x_discrete(limits = rev(levels(df_all$outcome)))

picture

Here is the picture of the result

I'm hoping there's a solution that handles both situations in the example above: 1) the pink alternative SE bar for "Name 1" is too large on both sides, so ideally it would run from one end of the graph to the other; 2) the blue alternative SE bar for "Name 3" is too large on the right only, so it should stop inside the plot on the left and continue to the limit on the right. Thanks!


Solution

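  • See both answers to "How to set limits for axes in ggplot2 R plots?". ylim() sets limits on the scale, so any value outside them is converted to NA before plotting, and an error bar with an NA endpoint is dropped with the "missing values" warning. Limits set on the coordinate system only zoom the view and leave the data intact. Normally coord_cartesian() is used for this, but since you are already using coord_flip(), you can pass the limits to it directly and drop the ylim() call: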

    picture <- ggplot(df_all, aes(x = outcome, y = beta, group = sample, colour = sample)) + 
         geom_hline(yintercept = c(-0.375, -0.125, 0.125, 0.375), size = 0.25, colour = 'grey95') +
         geom_errorbar(aes(ymin = beta-1.96*se, ymax = beta+1.96*se), width = 0, alpha = 1, size = 2, position = pd) +
         geom_errorbar(aes(ymin = beta-1.96*se2, ymax = beta+1.96*se2), width = 0, alpha = 1, size = 0.5, position = pd) +
         geom_hline(yintercept = 0, size = 0.25) + 
         guides(colour = guide_legend(reverse = TRUE), shape = guide_legend(reverse = TRUE)) + 
         coord_flip(ylim = c(-0.5,0.5)) + 
         scale_x_discrete(limits = rev(levels(df_all$outcome)))
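
To see the difference between scale limits and coordinate limits without drawing anything, here is a minimal sketch on toy data. The data frame `toy` and its values are made up for illustration and are not the asker's data. It also shows a third option that is not part of the answer above: keeping the scale limits but passing `oob = scales::squish`, which moves out-of-range values onto the nearest limit instead of turning them into NA. `coord_cartesian()` is used here for simplicity; `coord_flip(ylim = ...)` takes the same kind of limits.

```r
library(ggplot2)

# Toy data (hypothetical values): the interval for "B" runs past the upper limit of 0.5.
toy <- data.frame(
  outcome = c("A", "B"),
  beta    = c(0.1, 0.3),
  se2     = c(0.05, 0.2)
)

base_plot <- ggplot(toy, aes(x = outcome, y = beta)) +
  geom_errorbar(aes(ymin = beta - 1.96 * se2, ymax = beta + 1.96 * se2), width = 0)

# 1) Scale limits: out-of-range values are censored (set to NA) by default.
p_censor <- base_plot + ylim(-0.5, 0.5)

# 2) Coordinate limits: the view is zoomed, the data are left untouched.
p_zoom <- base_plot + coord_cartesian(ylim = c(-0.5, 0.5))

# 3) Scale limits with squishing: out-of-range values are moved onto the nearest limit.
p_squish <- base_plot +
  scale_y_continuous(limits = c(-0.5, 0.5), oob = scales::squish)

# Inspect the upper ends of the bars after ggplot2 has applied the scales.
ymax_censor <- layer_data(p_censor)$ymax
ymax_zoom   <- layer_data(p_zoom)$ymax
ymax_squish <- layer_data(p_squish)$ymax
```

For the "B" bar, `ymax_censor` should be NA (the row that triggers the warning), `ymax_zoom` should keep its original value, and `ymax_squish` should sit exactly on the limit. One practical difference: with coordinate limits the bar is cut off at the panel edge, whereas with squishing the bar ends at the limit itself, so the default axis expansion leaves a small gap between the bar end and the panel edge.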