
Accelerate for-loop containing coxph function call


This is similar to a question I posted before, for which I still have no satisfactory solution.

In the code below, I fit a Cox PH model to some example data and pass it to a custom function. Inside a for-loop, that function calls predictSurvProb from pec once per row and stores each result in a pre-allocated numeric vector, prediction.

I have tried cmpfun from compiler, but performance did not improve. Am I destined to live with this slow processing speed, or is there a way to speed it up? I cannot code in C++, so I imagine Rcpp is not an option.

Thanks.

# require()/library() load one package per call
library(dplyr)
library(survival)
library(pec)

cox_model <- coxph(Surv(time, status) ~ sex, data = lung)

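surv_preds <- function(model, query) {

  # Pre-allocate one survival probability per row of `query`
  prediction <- vector(mode = "numeric", length = nrow(query))
  time <- 30  # prediction horizon added to each row's observed time

  # seq_len() is safe when `query` has zero rows (1:0 would iterate twice)
  for (i in seq_len(nrow(query))) {
    prediction[i] <- predictSurvProb(model, newdata = query[i, ], times = query[i, "time"] + time)
  }
  prediction
}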

surv_preds(cox_model, lung)

Solution

  • After exhausting myself with attempts at C++ and parallel computing, I found that simply converting the factor variables to integers significantly sped up my application. Processing time dropped from almost a week to about 60 hours!

    I suspect this is still not the most efficient solution, but it will have to do for now.
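One further thing that may be worth trying is to drop the per-row loop. The sketch below assumes that predictSurvProb returns a matrix with one row per row of newdata and one column per entry of times when it is given several rows and several times at once. The names surv_preds_vec and horizon are made up for illustration and are not part of pec or of the original code. I have not benchmarked this against the real data.

```r
library(survival)
library(pec)

cox_model <- coxph(Surv(time, status) ~ sex, data = lung)

# Hypothetical vectorised variant of surv_preds(); `horizon` stands in for
# the hard-coded `time <- 30` in the question.
surv_preds_vec <- function(model, query, horizon = 30) {
  target <- query$time + horizon   # one target time per row
  times  <- sort(unique(target))   # distinct target times, in increasing order

  # Single call: rows follow `query`, columns follow `times`
  prob <- predictSurvProb(model, newdata = query, times = times)

  # Matrix indexing: for row i, pick the column that matches its own target time
  prob[cbind(seq_len(nrow(query)), match(target, times))]
}

prediction <- surv_preds_vec(cox_model, lung)
```

The intermediate matrix has nrow(query) rows and one column per distinct target time, so on a large data set it may need to be built in chunks of rows to fit in memory. Target times beyond the last observed follow-up time may come back as NA, so it is worth checking a few rows against the loop version before relying on the result.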