I am plotting survival functions with the survival package. Everything works fine, but how do I know which curve is which? And how can I add it to a legend?
library(survival)
url <- "http://socserv.mcmaster.ca/jfox/Books/Companion/data/Rossi.txt"
Rossi <- read.table(url, header = TRUE)[, 1:10]
km <- survfit(Surv(week, arrest) ~ race, data = Rossi)
plot(km, lty = c(1, 2))
How do I know which curve is which?
Using str(km), you can see which elements km contains.
km$strata shows that the two strata have 48 and 10 elements. This matches km$surv, where the values decline over the first 48 items and then decline again over the last 10:
km$surv[1:48]
km$surv[49:58]
So, in addition to the hint about the order given by print(), with this particular dataset we can also be sure that the first 48 elements belong to race=black.
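The same check can be done without reading the positions off by eye. This is only a minimal sketch: it assumes the lung data bundled with survival as a stand-in so that it runs without a download, and the names fit, ends, starts, ranges and declining are illustrative, not from the original. The same lines should apply to km.

```r
library(survival)

# Stand-in fit (assumption): lung ships with survival; swap in km from above.
fit <- survfit(Surv(time, status) ~ sex, data = lung)

# fit$strata is a named vector with one entry per curve, giving how many
# elements of fit$time / fit$surv belong to that curve, in order.
ends   <- cumsum(fit$strata)
starts <- ends - fit$strata + 1
ranges <- data.frame(stratum = names(fit$strata),
                     first = as.integer(starts),
                     last = as.integer(ends))
print(ranges)

# Each block is one survival curve, so it should never increase.
declining <- mapply(function(a, b) all(diff(fit$surv[a:b]) <= 0), starts, ends)
print(declining)
```

For km this should reproduce the ranges 1 to 48 and 49 to 58 used above.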
And how can I add it to a legend?
Unlike other model output, km is not easily transformed into a data.frame. However, we can extract the elements ourselves, build a data.frame from them, and plot that.
First we create a factor identifying the stratum of each element: 48 for race=black and 10 for race=other.
race <- as.factor(c(rep("black", 48), rep("other", 10)))
df <- data.frame(surv = km$surv, race = race, time = km$time)
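The counts do not have to be typed in by hand, since they are already stored in the strata element. A hedged sketch of that variant, again using the bundled lung data as a stand-in (an assumption, so the example runs offline); fit, labels, group and df_fit are illustrative names:

```r
library(survival)

# Stand-in fit (assumption): replace with km to get the race factor.
fit <- survfit(Surv(time, status) ~ sex, data = lung)

# Strip the "variable=" prefix from the stratum names, e.g. "sex=1" -> "1".
labels <- sub("^.*=", "", names(fit$strata))

# Repeat each label as many times as its stratum has elements.
group <- factor(rep(labels, times = fit$strata), levels = labels)

df_fit <- data.frame(surv = fit$surv, group = group, time = fit$time)
head(df_fit)
```

This avoids hard-coding 48 and 10, so it keeps working if the data or the grouping variable changes.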
Next we can plot it as usual (in my case, using ggplot2).
library(ggplot2)
ggplot(data = df, aes(x = time, y = surv)) +
  geom_point(aes(colour = race)) +
  geom_line(aes(colour = race)) +
  theme_bw()
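If you would rather stay with the base plot() from the question, a legend can be added directly, because the curves are drawn in the order of the strata. A minimal sketch, assuming the bundled lung data as a stand-in and the illustrative names fit and line_types:

```r
library(survival)

# Stand-in fit (assumption): replace with km from the question.
fit <- survfit(Surv(time, status) ~ sex, data = lung)

# One line type per stratum, in the same order the curves are drawn.
line_types <- seq_along(fit$strata)

plot(fit, lty = line_types, xlab = "Time", ylab = "Survival probability")

# Reuse the stratum names as legend labels, matched to the line types.
legend("topright", legend = names(fit$strata), lty = line_types, bty = "n")
```

With km, the labels would be the stratum names race=black and race=other, in that order.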