Search code examples
rplotlegendsurvival-analysis

R Survival Curve Plot Legend


I have a table that looks like this:

ID Survival Event Allele
2   5   1   WildType
2   0   1   WildType
3   3   1   WildType
4   38  0   Variant

I want to do a kaplan meier plot, and tell me if wild type or variants tend to survive longer.

I have this code:

library(survival)
Table <-read.table("Table1",header=T)
fit=survfit(Surv(Table$Survival,Table$Event)~Table$Allele)
plot(fit,lty=2:3,col=3:4)

From the fit p value, I can see that the survival of these two groups have significantly different survival curves.

survdiff(formula = Surv(dat$Death, dat$Event) ~ dat$Allele, rho = 0)
#                            N Observed Expected (O-E)^2/E (O-E)^2/V 
#    dat$Allele=Variant   5592     3400     3503      3.00      8.63
#    dat$Allele=WildType  3232     2056     1953      5.39      8.63
#    Chisq= 8.6  on 1 degrees of freedom, p= 0.0033

The plot looks as expected (i.e. two curves).

All I want to do is put a legend on the plot, so that I can see which data is represented by the black and red lines, i.e. do the Wild Type or Variant survive longer.

I have tried these two commands:

lab <-gsub("x=","",names(fit$strata))
legend("top",legend=lab,col=3:4,lty=2:3,horiz=FALSE,bty='n')

The first command works (i.e. I get no error). The second command, I get this error:

Error in strwidth(legend, units = "user", cex = cex, font = text.font) : plot.new has not been called yet

I've tried reading forums etc., but none of the answers seem to work for me (for example, changing between top/topright/topleft etc. doesn't matter).

Edit 1: This is an example of a table for which I get this error:

    ID Survival Event Allele
25808   5   1   WTHomo
22196   0   1   Variant
22518   3   1   Variant
25013   38  0   Variant
27354   5   1   Variant
27223   4   1   Variant
22700   5   1   Variant
22390   24  1   Variant
17586   1   1   Variant

What exactly happens is: when I type the very last command ( legend("top",legend=lab,col=3:4,lty=2:3,horiz=FALSE,bty='n')), the XII window opens, except it's completely blank.

But then if you just type "plot(fit,lty=2:3,col=3:4)", the XII window and the plot appear.

Edit 2: Also, this graph will have two lines, how do I tell which line is which variable? Would the easiest way to do this be to type summary(fit) which gives me two tables. Then, whichever variable comes first in the table, I put in first in the legend?

Many thanks Eva


Solution

  • You can also do this using ggsurvplot() from survminer. Here is an example

    library(survminer) # Contains ggsurvplot()
    library(survival) # Contains survfit()
    ggsurvplot(
      fit=survfit(Surv(time, censor) ~ Allele, data=your_data,type="kaplan-meier"), # Model
      xlab="Years",
      ylab="Overall survival probability",
      legend.labs=c("WildType","Variant"), # Assign names to groups which are shown in the plot
      conf.int = T, # Adds a 95%-confidence interval
      pval = T, # Displays the P-value in the plot
      pval.method = T # Shows the statistical method used for obtaining the P-value
    )