Kaplan Meier
I am working in R Markdown.
I would like to know if I am calculating the survival time correctly. My data frame is MOWHTO_COMPLICATIONS.
It has the following variables: D_SURGERY (date of surgery), REV_ARTHROPLASTY (date of revision), and CENSOR_STATUS (either 0 = censored or 1 = revised).
REV_ARTHROPLASTY has dates for the revised cases only. The remaining cells are empty.
I calculated the survival time in years using the following code:
MOWHTO_COMPLICATIONS$SURVIVAL_TIME_YEARS = as.numeric(difftime(MOWHTO_COMPLICATIONS$REV_ARTHROPLASTY, MOWHTO_COMPLICATIONS$D_SURGERY, units = "weeks"))/52.25
Then I created the curve using the following code:
survfit2(Surv(SURVIVAL_TIME_YEARS, CENSOR_STATUS) ~ 1, data = MOWHTO_COMPLICATIONS) %>%
  ggsurvfit() +
  labs(
    x = "Years",
    y = "Overall Survival Probability"
  ) +
  add_confidence_interval() +
  add_risktable()
Then I wanted to see the survival at 10 years and used the following code:
summary(survfit(Surv(SURVIVAL_TIME_YEARS, CENSOR_STATUS) ~ 1, data = MOWHTO_COMPLICATIONS), times = 10)
And I had the following results:
Call: survfit(formula = Surv(SURVIVAL_TIME_YEARS, CENSOR_STATUS) ~ 1, data = MOWHTO_COMPLICATIONS)

678 observations deleted due to missingness
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
   10      8      55    0.127  0.0419       0.0665        0.243
So the survival is 0.127, that is 13% at 10 years. That cannot be correct; it should be over 80 or 90%.
What am I doing wrong? Is it the survival time? Should I have dates for the cases that were not revised? And if so, what should those dates be?
Any help would be very much appreciated.
The empty cells need to have a date in them; otherwise, when you take the difftime, the result will be NA.
For example:
as.numeric(difftime(ISOdate(2001, 4, 26), NA, units = "weeks"))/52.25
returns NA
Instead, for the cases where the event of interest did not occur, fill in the last date the subject was observed. That way, when you subtract the dates, you will get a value for time.
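A minimal sketch of that fix on toy data, using the column names from the question. The column D_LAST_FOLLOWUP is hypothetical: it stands for wherever the last date each unrevised patient was observed is stored, and the dates themselves are made up for illustration. The sketch also divides days by 365.25 rather than weeks by 52.25, which is an assumption on my part about the intended year length.

```r
# Toy data with the question's column names. D_LAST_FOLLOWUP is a hypothetical
# column holding the last date each patient was observed.
toy <- data.frame(
  D_SURGERY        = as.Date(c("2005-03-01", "2006-07-15", "2008-01-10")),
  REV_ARTHROPLASTY = as.Date(c("2011-03-01", NA, NA)),
  D_LAST_FOLLOWUP  = as.Date(c("2011-03-01", "2020-07-15", "2019-01-10")),
  CENSOR_STATUS    = c(1, 0, 0)
)

# End of observation: the revision date for revised cases,
# the last follow-up date for everyone else.
toy$END_DATE <- toy$REV_ARTHROPLASTY
not_revised <- is.na(toy$END_DATE)
toy$END_DATE[not_revised] <- toy$D_LAST_FOLLOWUP[not_revised]

# Survival time in years, using an average of 365.25 days per year.
toy$SURVIVAL_TIME_YEARS <- as.numeric(
  difftime(toy$END_DATE, toy$D_SURGERY, units = "days")
) / 365.25

# Every row now has a time value, so none would be dropped for missingness.
stopifnot(!anyNA(toy$SURVIVAL_TIME_YEARS))
```

Subset assignment is used instead of ifelse() because ifelse() drops the Date class from its result.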
There is a hint in this part of the output:
678 observations deleted due to missingness
Those records were deleted because they didn't have a time value, so the curve was estimated from the revised cases only. That is why the survival looks so low.
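To see the effect of those dropped rows, here is a small sketch with made-up numbers (the vectors and data frame names are illustrative, not from the question). It assumes the survival package is installed. The first fit leaves the censored cases with a missing time, as in the question; the second gives them a follow-up time.

```r
library(survival)

# Made-up example: 2 revised cases and 3 censored cases.
status <- c(1, 1, 0, 0, 0)

# Censored cases have no time value, as in the question.
with_na <- data.frame(time = c(2, 5, NA, NA, NA), status = status)
fit_na <- survfit(Surv(time, status) ~ 1, data = with_na)

# Censored cases carry their follow-up time instead.
filled <- data.frame(time = c(2, 5, 8, 9, 10), status = status)
fit_filled <- survfit(Surv(time, status) ~ 1, data = filled)

# Number of observations each fit actually used.
fit_na$n
fit_filled$n

# Lowest estimated survival probability in each fit.
min(fit_na$surv)
min(fit_filled$surv)
```

In the first fit only the two revised cases remain, so the curve falls all the way to zero. In the second fit the censored cases stay in the risk set and the estimate stays well above zero.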