I am doing Kaplan-Meier analyses with the survival package and need to display the concrete number of survivors for certain time periods in a Kaplan-Meier plot.
For better reproducibility, let's use the tongue data from the example package KMsurv:
library(survival)
library(KMsurv)
data(tongue)
my.fit <- survfit(Surv(tongue$time, tongue$delta) ~ 1)
plot(my.fit, conf.int = FALSE)
What I need is to display the number of survivors at certain time points (e.g. at 50, 100, 150, 200, ...) as text along the x-axis; in this case that would be 49, 22, 11, 5, ...
The problem is that summary(my.fit) doesn't give me the number of remaining survivors at time 50, so I would need the value at the previous time step. And this should be done for every step of the interval that I set. Here is part of the summary for better understanding:
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
   32     51       1    0.634  0.0541       0.5363        0.750
   41     50       1    0.621  0.0545       0.5232        0.738
   42     49       1    0.609  0.0549       0.5101        0.726
   51     48       1    0.596  0.0552       0.4971        0.715
   56     47       1    0.583  0.0554       0.4842        0.703
How can I get a list or data.frame of the number of survivors for certain time periods? For 50 day steps the list would be c(49,22,11,5,5,5,5,5). If I could generate that, I would include it in the plot with
text(y=0.1,x=seq(0,400,50),labels=survivorslist)
If I understood the 'tongue' data correctly, you can use the 'time' variable (time to death or, for censored patients, time on study) to count the patients whose follow-up ends in a given time interval (here in steps of 50) like this:
tt <- table(cut(x = tongue$time, breaks = seq(from = 0, to = 400, by = 50)))
tt
#    (0,50]  (50,100] (100,150] (150,200] (200,250] (250,300] (300,350] (350,400]
#        32        27        13         4         3         0         0         1
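To see how cut bins the values, here is a small check on made-up times (toy_time and toy_breaks are hypothetical names, not part of the tongue data). It is only a sketch, but it illustrates two defaults worth keeping in mind: the intervals are right-closed, and any value outside the breaks becomes NA and is silently left out by table.

```r
# Hypothetical follow-up times, not taken from the tongue data
toy_time   <- c(10, 50, 51, 100, 120, 400, 450)
toy_breaks <- seq(from = 0, to = 400, by = 50)

# Right-closed intervals by default: 50 lands in (0,50], 51 in (50,100]
toy_bins <- cut(x = toy_time, breaks = toy_breaks)
toy_tt   <- table(toy_bins)

# 450 lies beyond the last break, so cut() returns NA for it
# and table() does not count it
n_dropped <- sum(is.na(toy_bins))
```

If n_dropped is not zero for your own data, the counts no longer add up to the sample size, so the breaks should be widened before subtracting the cumulative sum from the total.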
The number of survivors after each time step is then:
80 - cumsum(tt)
#    (0,50]  (50,100] (100,150] (150,200] (200,250] (250,300] (300,350] (350,400]
#        48        21         8         4         1         1         1         0
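As a possible alternative to counting by hand, summary() on a survfit object accepts a times argument, and the n.risk component of the result can then be used as labels. The sketch below runs on made-up data (toy_time, toy_status, label_times and at_risk are hypothetical names), so treat it as an illustration rather than a drop-in solution. As far as I can tell, n.risk counts the patients still under observation just before each requested time, so it can differ from the cut-based count when an observation falls exactly on a break.

```r
library(survival)

# Hypothetical toy data, not the tongue data set
toy_time   <- c(10, 30, 60, 80, 120, 160, 210, 260)
toy_status <- c(1, 0, 1, 1, 0, 1, 1, 0)  # 1 = death, 0 = censored
toy_fit    <- survfit(Surv(toy_time, toy_status) ~ 1)

# Time points at which the labels should appear
label_times <- seq(from = 0, to = 250, by = 50)

# extend = TRUE keeps requested times that lie beyond the last observation
at_risk <- summary(toy_fit, times = label_times, extend = TRUE)$n.risk

# Draw the curve and write the numbers just above the x-axis
plot(toy_fit, conf.int = FALSE)
text(x = label_times, y = 0.1, labels = at_risk)
```

With the fit from the question, the same call would be summary(my.fit, times = seq(0, 400, 50), extend = TRUE)$n.risk, which returns one value per requested time and therefore matches the length of the x vector passed to text().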