Okay. So this is a similar question to one I posted before for which I still have no satisfactory solution.
As you will see below, I have used some example data to build a Cox PH model, which is then passed to a custom function. That function contains a for-loop that calls predictSurvProb from pec on each row and ultimately populates a pre-allocated empty vector, prediction.
I have tried using cmpfun from compiler. However, there is no improvement in performance. Am I destined to just live with this slow processing speed, or are there any ways that I can speed up the processing? I cannot code in C++, so I'd imagine Rcpp is not an option.
Thanks.
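# require() loads only one package per call, so load each one separately
library(dplyr)
library(survival)
library(pec)

cox_model <- coxph(Surv(time, status) ~ sex, data = lung)

surv_preds <- function(model, query) {
  # Pre-allocate the result vector, one entry per row of query
  prediction <- vector(mode = "numeric", length = nrow(query))
  time <- 30  # predict 30 time units beyond each row's own time
  for (i in seq_len(nrow(query))) {
    prediction[i] <- predictSurvProb(model, newdata = query[i, ], times = query[i, "time"] + time)
  }
  prediction
}

surv_preds(cox_model, lung)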
After exhausting myself with attempts at C++ and parallel computing, I have found that simply converting factor variables to integers has significantly improved my application. The change reduces the processing time from almost a week to about 60 hours!
I suspect that this is still not the most efficient solution but it will have to do for now.
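For anyone wanting to try the same thing, here is a minimal sketch of the factor-to-integer conversion in base R. The data frame and its column names (sex, stage) are made up for illustration and are not from my application. Note that as.integer() on a factor returns the level codes, so a factor with more than two levels will then be treated as a numeric covariate by the model; the conversion presumably has to be applied to the data used for fitting and the data used for prediction alike.

```r
# Hypothetical example data; the column names are illustrative only.
query <- data.frame(
  time  = c(5, 10, 15),
  sex   = factor(c("male", "female", "male")),
  stage = factor(c("II", "I", "III"))
)

# Find the factor columns and replace each with its integer level codes.
is_fac <- vapply(query, is.factor, logical(1))
query[is_fac] <- lapply(query[is_fac], as.integer)

# All columns are now numeric or integer; no factors remain.
str(query)
```

Levels are coded in sorted order by default, so it may be worth checking the mapping with levels() before converting if the ordering matters.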