r, ggplot2, survival-analysis, ggsurvfit

Showing survival probability in % on y-axis for different time-points


I'm using ggsurvfit to plot a Kaplan–Meier curve of survival probability. I know that I can mark the survival probability at a given time point with, e.g., add_quantile(x_value = 5). However, I would also like to show the survival probability at that time point as a number, in percent, on the y-axis. Is there any way to do this? I have looked through the ggsurvfit examples and searched Stack Overflow but have not found anything.

    library(ggsurvfit)

    survfit2(Surv(time, status) ~ surg, data = df_colon) %>%
      ggsurvfit(linewidth = 0.8) +
      add_censor_mark(size = 2, alpha = 0.2) +
      add_quantile(x_value = 5, color = "grey30", linewidth = 0.8) +
      scale_ggsurvfit()

Solution

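  • The data you're looking for is stored in the plot's 3rd layer (the one added by add_quantile()), so we can extract it and add it to the breaks of a scale_*_continuous() call. Note that the index 3 depends on the order in which the layers were added.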

    library(ggsurvfit)
    
    s <- survfit2(Surv(time, status) ~ surg, data = df_colon) %>%
      ggsurvfit(linewidth = 0.8) +
      add_censor_mark(size = 2, alpha = 0.2) +
      add_quantile(x_value = 5,
                   color = "grey30",
                   linewidth = 0.8)
    
    s +
      scale_y_continuous(
        limits = c(0, 1),
        labels = scales::label_percent(),
        # add the survival probabilities at the quantile to the regular breaks
        breaks = c(s[["layers"]][[3]][["data"]][["y"]], seq(0, 1, 0.2))
      ) +
      scale_x_continuous(
        labels = scales::label_number(accuracy = 1),
        # add the quantile's time point to the regular breaks
        breaks = c(
          s[["layers"]][[3]][["data"]][["x"]][[1]],
          seq(floor(min(df_colon$time)), ceiling(max(df_colon$time)), 2)
        )
      )
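
A possible variation on the approach above, offered as a sketch rather than as the answer's own method: instead of reading the values back out of the plot's layers, the survival probabilities can be computed from the fitted model with summary(), which avoids depending on the layer index. The names fit, t0, probs and p are introduced here for illustration, and the code assumes that summary() at times = t0 yields the same per-stratum values that add_quantile(x_value = t0) marks.

```r
library(ggsurvfit)  # as in the answer, this also provides Surv() and df_colon

fit <- survfit2(Surv(time, status) ~ surg, data = df_colon)

# Assumption: one survival estimate per stratum at time t0,
# matching the points marked by add_quantile(x_value = t0)
t0 <- 5
probs <- summary(fit, times = t0)$surv

p <- ggsurvfit(fit, linewidth = 0.8) +
  add_censor_mark(size = 2, alpha = 0.2) +
  add_quantile(x_value = t0, color = "grey30", linewidth = 0.8) +
  scale_y_continuous(
    limits = c(0, 1),
    # one decimal place so the extra breaks are not rounded to whole percents
    labels = scales::label_percent(accuracy = 0.1),
    # merge the computed probabilities with the regular 20% grid
    breaks = sort(unique(c(probs, seq(0, 1, 0.2))))
  )

p
```

If two probabilities lie close to each other or to a regular grid value, their axis labels may overlap; dropping the nearest regular break is one way to handle that.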