
Calculate number of survivors in a KM plot in certain time intervals


I am doing Kaplan-Meier analyses with the survival package and need to display the actual number of survivors at certain time points in a Kaplan-Meier plot.

To make this reproducible, let's use the tongue data set from the KMsurv package:

library(survival)
library(KMsurv)
data(tongue)
# Kaplan-Meier fit for the whole sample (delta is the event indicator)
my.fit <- survfit(Surv(time, delta) ~ 1, data = tongue)
plot(my.fit, conf.int = FALSE)

What I need is to display the actual number of survivors at certain points as text along the x-axis (e.g. at 50, 100, 150, 200, ...); in this case that would be 49, 22, 11, 5, ...

The problem is that summary(my.fit) doesn't give me the number of remaining survivors at time 50, so I would need the value from the previous time step. This has to be done for every point of the interval that I set. Here is part of the summary for clarity:

time n.risk n.event survival std.err lower 95% CI upper 95% CI    
32     51       1    0.634  0.0541       0.5363        0.750
41     50       1    0.621  0.0545       0.5232        0.738
42     49       1    0.609  0.0549       0.5101        0.726
51     48       1    0.596  0.0552       0.4971        0.715
56     47       1    0.583  0.0554       0.4842        0.703

How can I get a list or data.frame of the number of survivors for certain time periods? For 50 day steps the list would be c(49,22,11,5,5,5,5,5). If I could generate that, I would include it in the plot with

text(y=0.1,x=seq(0,400,50),labels=survivorslist)

Solution

  • If I understood the 'tongue' data correctly, you can use the 'time' variable ("Time to death") to count the patients who died in each time interval (here in steps of 50) like this:

    tt <- table(cut(x = tongue$time, breaks = seq(from = 0, to = 400, by = 50)))
    tt
    # (0,50]  (50,100] (100,150] (150,200] (200,250] (250,300] (300,350] (350,400] 
    #     32        27        13         4         3         0         0         1
    

    The number of survivors after each time step is then:

    nrow(tongue) - cumsum(tt)   # nrow(tongue) is 80, the total number of patients
    # (0,50]  (50,100] (100,150] (150,200] (200,250] (250,300] (300,350] (350,400] 
    #     48        21         8         4         1         1         1         0
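
    As an alternative to counting by hand, summary() on a survfit object accepts a times argument and reports n.risk at exactly those times, so the "previous time step" lookup from the question is not needed. The following is a minimal sketch, not a definitive implementation: the helper name n_at_risk and the small toy data set are made up for illustration, so that the block runs without KMsurv installed.

```r
library(survival)

# Hypothetical helper (the name is an assumption, not part of any package):
# number at risk at each time in `at`, read from summary.survfit()
# instead of being counted by hand.
n_at_risk <- function(fit, at) {
  # extend = TRUE also returns rows for times past the last observation
  summary(fit, times = at, extend = TRUE)$n.risk
}

# Small made-up data set standing in for 'tongue'
toy <- data.frame(time  = c(5, 10, 10, 20, 30),
                  delta = c(1, 0, 1, 1, 0))
toy.fit  <- survfit(Surv(time, delta) ~ 1, data = toy)
toy.at   <- c(0, 10, 25, 40)
toy.risk <- n_at_risk(toy.fit, toy.at)

# Annotate the curve with the counts, as in the question
plot(toy.fit, conf.int = FALSE)
text(x = toy.at, y = 0.1, labels = toy.risk)
```

    For the fit from the question, the same call would be n_at_risk(my.fit, seq(0, 400, 50)). Two things to keep in mind when comparing the results with the cut() approach: cut() on 'time' counts every patient whose follow-up ends in an interval, whether by death or by censoring, and n.risk at time t still includes a patient whose follow-up ends exactly at t, whereas an interval such as (0,50] removes that patient at 50. The two sets of numbers can therefore differ slightly at the interval boundaries.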