I'm trying to create a set of univariate coxph models, with categorical explanatory variables and, for each model, I would like to comapre the survival functions by level of variable. For example I'm creating:
library(survival)
set.seed(4321)
data <- data.frame(cbind(
sample(0:1, 40, replace=TRUE), # event
sample(0:24, 40, replace=TRUE), # time
sample(1:2, 40, replace=TRUE), # Diam
sample(3:4, 40, replace=TRUE) # N
))
colnames(data) <- c("event","time","Diam","N")
data$Diam <- as.factor(data$Diam)
data$N <- as.factor(data$N)
model1 <- coxph(Surv(time, event) ~ Diam, data = data)
model2 <- coxph(Surv(time, event) ~ N, data = data)
where Diam and N are have two levels. I prepared this code for plottiing the survival functions:
plot(survfit(model1, newdata = data), fun = "s",
conf.int = TRUE,
col = 2:4,
xlab = "Months",
ylab = "Proportion without event",
main = "Survival curve by Diam")
legend("bottomleft",
legend = levels(data$Diam),
lty = 1, col = c(2,4))
plot(survfit(model2, newdata = data), fun = "s",
conf.int = TRUE,
col = 2:4,
xlab = "Months",
ylab = "Proportion without event",
main = "Survival curve by N")
legend("bottomleft",
legend = levels(data$N),
lty = 1, col = c(2,4))
and this is the output:
I don't understand:
Any help would be very appreciated! Thank you in advance, Francesca
I found strange that you put three colours when drawing the plot and only two in the legend. I managed to draw it correctly with this
plot(survfit(model1, newdata = data), fun = "s",
conf.int = TRUE,
col = c(1,2,2),
xlab = "Months",
ylab = "Proportion without event",
main = "Survival curve by Diam")
legend("bottomleft",
legend = levels(data$Diam),
lty = 1, col = levels(data$Diam))
plot(survfit(model2, newdata = data), fun = "s",
conf.int = TRUE,
col = levels(data$N),
xlab = "Months",
ylab = "Proportion without event",
main = "Survival curve by N")
legend("bottomleft",
legend = levels(data$N),
lty = 1, col = levels(data$N))
But note that I kept three colours in the first plot and only two in the second. I believe the problem is with the model or the sampling.
Also, using the a different seed (1, for example) you dont't have any problems with the colours.
set.seed(1)
data <- data.frame(cbind(
sample(0:1, 40, replace=TRUE), # event
sample(0:24, 40, replace=TRUE), # time
sample(1:2, 40, replace=TRUE), # Diam
sample(3:4, 40, replace=TRUE) # N
))
colnames(data) <- c("event","time","Diam","N")
data$Diam <- as.factor(data$Diam)
data$N <- as.factor(data$N)
model1 <- coxph(Surv(time, event) ~ Diam, data = data)
model2 <- coxph(Surv(time, event) ~ N, data = data)
plot(survfit(model1, newdata = data), fun = "s",
conf.int = TRUE,
col = levels(data$Diam),
xlab = "Months",
ylab = "Proportion without event",
main = "Survival curve by Diam")
legend("bottomleft",
legend = levels(data$Diam),
lty = 1, col = levels(data$Diam))
plot(survfit(model2, newdata = data), fun = "s",
conf.int = TRUE,
col = levels(data$N),
xlab = "Months",
ylab = "Proportion without event",
main = "Survival curve by N")
legend("bottomleft",
legend = levels(data$N),
lty = 1, col = levels(data$N))
cheers,