Tags: r, lapply, sapply, mapply, survival

How to run a bunch of models in R using a Function


library(survival)
justices <- read.csv("http://data.princeton.edu/pop509/justices2.csv")
PREDS = c("age", "year")
m = coxph(Surv(tenure, event == 1) ~ age + year, data = justices)
summary(m)

exp(coef(m)[1])
exp(confint(m,level=(1-0.05/1))[1,])


DVMOD <- function(PREDS, data){
  t <- coxph(paste0("Surv(tenure, event == 1) ~ "), PREDS + number + name, data = data)
  return((c(PREDS, coef(t)[1], confint(t)[1,])))
}

all_models <- lapply(PREDS,DVMOD, PREDS = PREDS, data=justices)

I want to fit a separate coxph model for each variable in PREDS and store the name of that variable together with its hazard ratio and confidence interval.


Solution

  • coxph() expects a formula object, not a character string. If you build the formula as text and convert it with as.formula(), DVMOD() can take the predictor name as an argument:

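    DVMOD <- function(PREDS, data){
         # Build the formula as text, then convert it so coxph() accepts it
         theFormula <- paste("Surv(tenure, event == 1) ~ ", PREDS, " + number + name")
         fit <- coxph(as.formula(theFormula), data = data)
         # coef() is on the log-hazard scale; c() coerces every element to character
         return(c(PREDS, coef(fit)[1], confint(fit)[1,]))
    }
    # Single call for one predictor
    DVMOD("age", justices)
    # Same call via lapply(); extra arguments are passed through to DVMOD()
    all_models <- lapply(c("age"), DVMOD, data = justices)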
    

    The lapply() call uses only age because the model for year does not converge: calling DVMOD() directly with "year" as the value of PREDS fails, so an lapply() over both variables fails as well.

    ...and the output:

    > all_models
    [[1]]
                                        age               2.5 %              97.5 % 
                  "age"  "24.4400841925434" "-20.8849057264629"  "69.7650741115497" 
    
    >
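
One possible way to keep the loop going when a single model fails, and to report hazard ratios rather than raw coefficients, is to wrap each fit in tryCatch() and collect one row per predictor. The sketch below is not the answer's code: it uses the lung data set that ships with the survival package instead of justices so that it runs without a download, and fit_one(), its lhs argument, and the deliberately invalid predictor name not_a_column are illustrative assumptions.

```r
library(survival)

# Hypothetical helper: fit one single-predictor Cox model and return a
# one-row data frame. If the fit raises an error, return a row of NAs
# plus the error message, so the surrounding lapply() keeps going.
fit_one <- function(pred, data, lhs = "Surv(time, status)") {
  tryCatch({
    fit <- coxph(as.formula(paste(lhs, "~", pred)), data = data)
    ci  <- confint(fit)  # 95% interval on the log-hazard scale
    data.frame(
      variable = pred,
      hr       = unname(exp(coef(fit)[1])),  # hazard ratio = exp(coefficient)
      lower    = unname(exp(ci[1, 1])),
      upper    = unname(exp(ci[1, 2])),
      error    = NA_character_,
      stringsAsFactors = FALSE
    )
  }, error = function(e) {
    data.frame(
      variable = pred, hr = NA_real_, lower = NA_real_, upper = NA_real_,
      error = conditionMessage(e),
      stringsAsFactors = FALSE
    )
  })
}

# The last name is not a column of lung, so that fit errors on purpose.
preds   <- c("age", "ph.ecog", "not_a_column")
results <- do.call(rbind, lapply(preds, fit_one, data = lung))
print(results)
```

Keeping the numbers in a data frame avoids the coercion to character that c() performs when a name and numeric values are combined. Note that tryCatch() as written only intercepts errors; convergence problems that coxph() reports as warnings would still pass through unless a warning handler is added.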