r ggplot2 forest-plots

How would one (1) limit the x-axis of a forest plot whilst adding an arrow for error bars that exceed it, and (2) have a log2 transformed x-axis?


There are two problems:

(1) The forest plot I'm trying to make with ggplot2 would ideally be limited to between 0 and 16 on the x-axis. I have tried adding arrows to points whose error bars go beyond the set limit, but for those points the half of the error bar that points in the opposite direction disappears completely.

(2) The x-axis is an odds ratio, which I would like to be log2 transformed. I would expect to see 1, 2, 4, 8, 16. But what I currently see is 0, 5, 10, 15. It may well be that the 1st problem is interfering with ggplot's ability to do this.

The plot I currently have

Here is some data and the code I used for you, the kind and helpful reader, to reproduce the problem.

library(tidyverse)
library(scales) # library() attaches one package per call

data <- tibble('SNP'=c('Univariable CM','1','2','3','4','5','6','7','8','9'),
               'OR'=c(10.13, 5.16, 2.83, 5.92, 4.71, 6.22, 3.79, 6.64, 7.80, 7.88),
               `lower OR`=c(5.3, 1.8, 1.6, 2.8, 2.3, 2.7, 1.3, 3.5, 3.8, 3.9),
               `upper OR`=c(19,14,5,12,7,14,11,13,18,17))

data$OR_u <- ifelse(data$`upper OR` > 16, 16, NA_real_) # for the limit x=16

data |>
  ggplot(aes(y = fct_rev(fct_relevel(SNP, "Univariable CM")))) +# fct_rev reverses the order of factor levels, A at top, J at bottom
  theme_classic() +
  geom_point(aes(x=OR, color=OR>0), shape=15, size=3, position = position_dodge(0.5)) + # estimates
  geom_linerange(aes(xmin=`lower OR`, xmax=`upper OR`), position = position_dodge(0.5)) + # CIs
  geom_vline(xintercept = 1, linetype="dashed") + # OR so x=1
  scale_x_continuous(trans='log2',
                     breaks=c(1,2,4,8,16),
                     labels=trans_format('log2', math_format(2^.x))) +
  xlim(0, 16) +
  labs(x="OR", y="") +# x-axis renamed
  theme(axis.line.y = element_blank(),
        axis.ticks.y= element_blank(),
        axis.text.y= element_blank(),
        axis.title.y= element_blank(),
        legend.position = 'none') +
  geom_segment(
    aes(x = OR, xend = OR_u, y = fct_rev(fct_relevel(SNP, "Univariable CM")), yend = fct_rev(fct_relevel(SNP, "Univariable CM"))),
    position = position_dodge(0.5), arrow = arrow(length = unit(0.3, "cm")))

Thank you so very much in advance!


Solution

  • To check, does this match what you're aiming to do? The small fixes:

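    • Adding xlim() after scale_x_continuous() replaces that scale (which is why the log2 axis disappears) and removes anything that falls outside the limits, so an interval with one end beyond 16 is dropped entirely. coord_cartesian(xlim = ...) only zooms, so everything is drawn 'up to' the limits.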
    • The limits here can't start at 0: on a log-transformed scale, 0 is an infinite distance from 2^0, which represents 1 (0 itself is 2^-Inf). Instead use a small positive value (here 0.8) so the axis starts a moderate distance below 1.
    library(tidyverse)
    library(scales)
    
    data <- tibble('SNP'=c('Univariable CM','1','2','3','4','5','6','7','8','9'),
                   'OR'=c(10.13, 5.16, 2.83, 5.92, 4.71, 6.22, 3.79, 6.64, 7.80, 7.88),
                   `lower OR`=c(5.3, 1.8, 1.6, 2.8, 2.3, 2.7, 1.3, 3.5, 3.8, 3.9),
                   `upper OR`=c(19,14,5,12,7,14,11,13,18,17))
    
    data$OR_u <- ifelse(data$`upper OR` > 16, 16, NA_real_) # for the limit x=16
    
    data |>
      ggplot(aes(y = fct_rev(fct_relevel(SNP, "Univariable CM")))) +# fct_rev reverses the order of factor levels, A at top, J at bottom
      theme_classic() +
      geom_point(aes(x=OR, color=OR>0), shape=15, size=3, position = position_dodge(0.5)) + # estimates
      geom_linerange(aes(xmin=`lower OR`, xmax=`upper OR`), position = position_dodge(0.5)) + # CIs
      geom_vline(xintercept = 1, linetype="dashed") + # OR so x=1
      scale_x_continuous(trans='log2',
                         breaks=c(1,2,4,8,16),
                         labels=trans_format('log2', math_format(2^.x))) +
      coord_cartesian(expand = FALSE, xlim = c(0.8, 16)) +
      labs(x="OR", y="") +# x-axis renamed
      theme(axis.line.y = element_blank(),
            axis.ticks.y= element_blank(),
            axis.text.y= element_blank(),
            axis.title.y= element_blank(),
            legend.position = 'none') +
      geom_segment(
        aes(x = OR, xend = OR_u, y = fct_rev(fct_relevel(SNP, "Univariable CM")), yend = fct_rev(fct_relevel(SNP, "Univariable CM"))),
        position = position_dodge(0.5), arrow = arrow(length = unit(0.3, "cm")))
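
To see the difference between the two kinds of limits without drawing anything, here is a minimal sketch. It assumes scales 1.1.1 or later for oob_censor() and oob_keep(), and the variable names are only illustrative. The three values are the first three upper bounds from the data above.

```r
library(scales)

upper  <- c(19, 14, 5)   # first three `upper OR` values from the data above
limits <- c(0.8, 16)

# Scale limits (what xlim() sets) censor out-of-range values to NA by default,
# so a geom that needs that value cannot be drawn.
censored <- oob_censor(upper, range = limits)

# coord_cartesian() leaves the data as they are and only zooms the view;
# oob_keep() mimics that by returning the values unchanged.
kept <- oob_keep(upper, range = limits)

# On a log2 axis a lower limit of 0 has no finite position.
log2_zero <- log2(0)

print(censored)   # NA 14  5
print(kept)       # 19 14  5
print(log2_zero)  # -Inf
```

The NA in the first position is why the whole interval for that row vanished in the original plot, not just the part beyond 16.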