I am hoping someone can provide some help with an analysis I have.
Context: I have a dataset with patients who had liver transplant (some of them are in treatment group A, others in treatment group B, depending on the immunosuppressive med they’re on). Due to the transplant, these patients are at high risk for developing donor derived HBV infections.
What’s needed: The investigator is interested in time until infection starts (first instance of HBV) and the proportion that develop the infection over time. They also want the cumulative incidence of HBV infection at baseline, and at each of the post-transplant follow-up timepoints (6 months, 12 months, 18 months and 24 months) for group A and group B. For example, the 6-months data would be the proportion of those patients with 6-months follow-up who ever had HBV, the 12-months data would the proportion of those with 12-months data follow-up who ever had HBV and so on.
Cumulative incidence in this specific case just means 1 minus the survival function, without accounting for any competing risks. The analysis population has no deaths or loss to follow up.
My questions are:
Below is a test dataset similar to the actual one, and what I have done so far:
time<-c(1.5989,6.9433, 0.8890, 3.2691, 1.0514, 2.7625, 1.4319, 0.9681, 7.4416, 0.0268, 1.5168, 1.9647, 0.0657, 4.3571, 6.4490, 0.2198, 1.2028, 0.9555, 0.2601, 2.0096, 7.5156, 0.4463, 0.2355, 0.9391, 2.6996)
censor<-c(1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0)
group<-c(1, 2, 1, 1, 2, 2, 1, 1, 1, 2, 1, 2, 1, 1, 2, 1, 2, 1, 1, 2, 2, 2, 1, 2, 1)
df<-data.frame(ID, time, censor, group)
View(df)
km<-survfit(formula = Surv(time, censor) ~ group, data = df)
summary(km)
#cumulative incidence plot
plot(km, fun = function(x) 1-x)
#log rank test;
survdiff(Surv(time, censor) ~ group, data=df)
#plot survival curves for each treatment group
plot(survfit(Surv(time, censor) ~ group, data = df),
xlab = "Time",
ylab = "Overall survival probability")
It is better to use survminer
package to draw survival plots by its ggsurvplot()
function.
fun
argument of ggsurvplot()
can be "event"
for cumulative incidence or "cumhaz"
for cumulative hazard function, or if left by default, would plot Kaplan Meier
curve.
Additional arguments such as pvalue also can be customized.
pvalue method is kaplan meier by default.
You can see more examples in survminer
documentations.
library(survminer)
#> Loading required package: ggplot2
#> Loading required package: ggpubr
time <- c(1.5989,6.9433, 0.8890, 3.2691, 1.0514, 2.7625, 1.4319, 0.9681, 7.4416,
0.0268, 1.5168, 1.9647, 0.0657, 4.3571, 6.4490, 0.2198, 1.2028, 0.9555,
0.2601, 2.0096, 7.5156, 0.4463, 0.2355, 0.9391, 2.6996)
censor <- c(1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0)
group <- c(1, 2, 1, 1, 2, 2, 1, 1, 1, 2, 1, 2, 1, 1, 2, 1, 2, 1, 1, 2, 2, 2, 1, 2, 1)
df <- data.frame(time, censor, group)
km <- survfit(formula = Surv(time, censor) ~ group, data = df)
#cumulative incidence plot
ggsurvplot(km,
data = df,
censor = T,
risk.table = TRUE,
legend.labs = c("group 1", "group 2"),
xlim = c(0,10),
ylim = c(0,1),
pval = T,
pval.method = T,
pval.method.coord = c(2.5,0.5),
pval.coord = c(4.2,0.5),
xlab = "Months",
ylab = "SURVIVAL PROBABILITY",
linetype = c(1,2),
legend.title = "",
palette = c('red', 'blue'),
fun="event"
)
Created on 2023-02-12 with reprex v2.0.2