r ggplot2 survival-analysis

How to extract summary() to a data frame applicable for data visualization in ggplot()?


I am doing survival analysis in the presence of competing risks. I use the prodlim package, which I find quite useful. However, I do not like the built-in graphics and would like to use ggplot instead.

Question: how can I extract the prodlim summary()-output and load it into a data frame accessible to ggplot2?

Perhaps a function can be written to do this? I have previously received help on StackOverflow in terms of loading a summary()-output into a dataframe, but with a different package than prodlim.

library(prodlim)
library(riskRegression)

# Built-in data
data(Melanoma)

# Simple competing risk analysis
fit.aj <- prodlim(Hist(time,status)~age+sex,data=Melanoma)

# Defining the event of interest as cause=1
summary(fit.aj,conf.int=FALSE,newdata=data.frame(age=50,sex="Male"),cause=1)

Which yields

> summary(fit.aj,conf.int=FALSE,newdata=data.frame(age=50,sex="Male"),cause=1)

----------> Cause:  1 

age=50, sex=Male :
  time n.risk n.event n.lost cuminc se.cuminc lower upper
1   10     33       0      0  0.000    0.0000 0.000 0.000
2 1513     22       0      0  0.282    0.0795 0.126 0.437
3 2006     15       0      0  0.282    0.0795 0.126 0.437
4 3042     10       0      0  0.396    0.0998 0.201 0.592
5 5565      0       0      0     NA        NA    NA    NA

The summary can easily be plotted; however, the graphics are not my style.

plot(fit.aj,conf.int=FALSE,newdata=data.frame(age=50,sex="Male"),cause=1)


I would like to convert the summary() output into a data frame (let's call it df) that can be passed to ggplot().

Based on the names in summary(), I am trying to achieve something like:

ggplot(df, aes(x=time)) +
geom_line(aes(y=cuminc)) +
geom_ribbon(aes(ymin = lower, ymax = upper))

Thanks for your suggestions.


Solution

  • summary() returns a nested list. Drill down to the level that holds the table you want to plot and convert it with as.data.frame(). You can check the structure of the list with str().

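    library(ggplot2)

    # The table sits at: table -> cause -> covariate stratum
    summ_list <- summary(fit.aj, conf.int = FALSE, newdata = data.frame(age = 50, sex = "Male"), cause = 1)
    df <- as.data.frame(summ_list$table$`1`$`age=50, sex=Male`)

    # The desired plot with ggplot (semi-transparent ribbon so the line stays visible)
    ggplot(df, aes(x = time)) +
      geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2) +
      geom_line(aes(y = cuminc))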
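Since the question asks whether a function could do this, here is a rough sketch of one. It is not part of prodlim: `flatten_cuminc` is a hypothetical helper name, and `mock_summary` is a hand-built stand-in that copies the values printed in the question, so that the example runs without fitting the model. It assumes the summary object nests its tables as table -> cause -> covariate stratum, as in the indexing above; check your own object with str() before relying on that.

```r
# Stand-in for the summary() result, built by hand from the printed output.
# ASSUMPTION: the real object has the same table -> cause -> stratum nesting.
mock_summary <- list(
  table = list(
    `1` = list(
      `age=50, sex=Male` = cbind(
        time   = c(10, 1513, 2006, 3042, 5565),
        n.risk = c(33, 22, 15, 10, 0),
        cuminc = c(0.000, 0.282, 0.282, 0.396, NA),
        lower  = c(0.000, 0.126, 0.126, 0.201, NA),
        upper  = c(0.000, 0.437, 0.437, 0.592, NA)
      )
    )
  )
)

# Inspect the nesting before indexing into it
str(mock_summary, max.level = 3)

# Hypothetical helper: stack every cause/stratum table into one data frame,
# keeping the cause and stratum labels as extra columns
flatten_cuminc <- function(tables) {
  rows <- list()
  for (cause in names(tables)) {
    for (stratum in names(tables[[cause]])) {
      tab <- as.data.frame(tables[[cause]][[stratum]])
      tab$cause <- cause
      tab$stratum <- stratum
      rows[[length(rows) + 1]] <- tab
    }
  }
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

df <- flatten_cuminc(mock_summary$table)
df <- df[!is.na(df$cuminc), ]  # drop the last row, which has no estimate

# Plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = time, group = stratum)) +
    geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2) +
    geom_line(aes(y = cuminc))
  print(p)
}
```

With the real object you would call `flatten_cuminc(summ_list$table)` instead. The label columns make it possible to map `stratum` or `cause` to colour or facets when the summary covers more than one covariate combination.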
    
