How to run several models in R using a function

library(survival)
justices <- read.csv("http://data.princeton.edu/pop509/justices2.csv")
PREDS = c("age", "year")
m = coxph(Surv(tenure, event == 1) ~ age + year, data = justices)
summary(m)

exp(coef(m)[1])
exp(confint(m,level=(1-0.05/1))[1,])


DVMOD <- function(PREDS, data){
  t <- coxph(paste0("Surv(tenure, event == 1) ~ "), PREDS + number + name, data = data)
  return((c(PREDS, coef(t)[1], confint(t)[1,])))
}

all_models <- lapply(PREDS,DVMOD, PREDS = PREDS, data=justices)

I want to run a separate coxph model for each variable in PREDS and store the name of that variable together with its hazard ratio and confidence interval.

Solution

If you build the formula as a string and convert it with as.formula(), you can vary the independent variable passed to the DVMOD() function as follows:

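DVMOD <- function(PREDS, data){
     # Build the formula as a string, then convert it so coxph() accepts it
     theFormula <- paste("Surv(tenure, event == 1) ~ ",PREDS," + number + name")
     fit <- coxph(as.formula(theFormula), data = data)
     # coef() and confint() are on the log-hazard scale; wrap them in exp() for hazard ratios
     return(c(PREDS, coef(fit)[1], confint(fit)[1,]))
}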
DVMOD("age",justices)
# lapply() forwards data = justices to every DVMOD() call
all_models <- lapply(c("age"), DVMOD, data = justices)
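As a possible alternative to pasting a string and calling as.formula(), base R's reformulate() builds the formula from a character vector of term labels. The sketch below is only an illustration under stated assumptions: it uses the lung data set that ships with survival as a stand-in for justices, and fit_one and preds are hypothetical names that do not appear in the original code.

```r
library(survival)

# Hypothetical helper: fit one Cox model for a single predictor name.
# `lung` (bundled with survival) stands in for the justices data.
fit_one <- function(pred, data) {
  # reformulate() assembles `Surv(...) ~ pred` without string pasting;
  # the response is passed as a quoted call rather than as a string
  f <- reformulate(pred, response = quote(Surv(time, status == 2)))
  coxph(f, data = data)
}

preds <- c("age", "sex", "ph.ecog")

# simplify = FALSE keeps a list; USE.NAMES labels each fit by its predictor
fits <- sapply(preds, fit_one, data = lung, simplify = FALSE, USE.NAMES = TRUE)

names(fits)
```

Naming the list elements after the predictors makes it easy to pull out a single model later, for example fits$age.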

Because the model for year does not converge (calling DVMOD() directly with "year" as the value of PREDS fails), the lapply() call over both variables fails as well, but it works with age alone.
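One way to keep a single failing model from aborting the whole loop is to wrap the fit in tryCatch() and return NULL on error. This is a minimal sketch, not the original code: safe_fit is a hypothetical helper, the lung data set from survival stands in for justices, and the deliberately misspelled column not_a_column is only there to force an error.

```r
library(survival)

# Hypothetical helper: return the fitted model, or NULL if fitting errors out.
safe_fit <- function(pred, data) {
  f <- as.formula(paste("Surv(time, status == 2) ~", pred))
  tryCatch(
    coxph(f, data = data),
    error = function(e) {
      # Report which predictor failed, then carry on with the rest
      message("Model for '", pred, "' failed: ", conditionMessage(e))
      NULL
    }
  )
}

# "not_a_column" does not exist in lung, so that fit raises an error
preds <- c("age", "not_a_column", "sex")
fits <- lapply(preds, safe_fit, data = lung)
names(fits) <- preds

# Drop the failed fits and keep only the models that were estimated
ok <- Filter(Negate(is.null), fits)
names(ok)
```

Note that coxph() often signals convergence trouble as a warning rather than an error, so this sketch only skips fits that stop with an error; catching warnings would need an additional handler.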

The output of the age-only lapply() call:

> all_models
[[1]]
                                    age               2.5 %              97.5 % 
              "age"  "24.4400841925434" "-20.8849057264629"  "69.7650741115497" 

>
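The result above is a character vector, because c() coerces the numbers to strings once the variable name is included, and the values are on the log-hazard scale. Since the question asks for hazard ratios, a sketch of one way to return them as a numeric table follows. It assumes the lung data set from survival in place of justices, and hr_row and hr_table are hypothetical names.

```r
library(survival)

# Hypothetical helper: one-row data frame with the hazard ratio and its CI.
hr_row <- function(pred, data) {
  f <- as.formula(paste("Surv(time, status == 2) ~", pred))
  fit <- coxph(f, data = data)
  ci <- confint(fit)[1, ]          # 95% limits on the log-hazard scale
  data.frame(
    predictor    = pred,
    hazard_ratio = exp(coef(fit)[[1]]),  # exponentiate to get the hazard ratio
    lower        = exp(ci[[1]]),
    upper        = exp(ci[[2]]),
    stringsAsFactors = FALSE
  )
}

# Stack the one-row results into a single table, one row per predictor
hr_table <- do.call(rbind, lapply(c("age", "sex"), hr_row, data = lung))
hr_table
```

Keeping the name in its own column leaves the estimates numeric, so the table can be sorted or filtered directly.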