Search code examples
rformulasurvival-analysis

R formula in survfit


I am having trouble with formulas, environments, and survfit().

Things work fine for lm() but they fail for survfit().

General problem statement:

I am fitting a series of formulas to some data. So, I call the modeling function with the formula passed as a variable. Later, I want to work with the formula from the fitted object.

(From my naive point of view, the trouble comes from survfit not recording the environment.)

Detailed Example

Expected behaviour as seen in lm():

library("plyr")

preds <- c("wt", "qsec")

f <- function() {
  lm(mpg ~ wt, data = mtcars)
}

fits <- alply(preds, 1, function(pred)
{
  modform <- reformulate(pred, response = "mpg")

  lm(modform, data = mtcars)
})

fits[[1]]$call$formula
##modform
formula(fits[[1]])
## mpg ~ wt
## <environment: 0x1419d1a0>

Even though fits[[1]]$call$formula resolves to modform I can still get the original formula with formula(fits[[1]]).

But things fail for survfit():

library("plyr")
library("survival")

preds <- c("resid.ds", "rx", "ecog.ps")

fits <- 
  alply(preds, 1, function(pred)
  {
    modform <- paste("Surv(futime, fustat)", pred, sep = " ~ ")
    modform <- as.formula(modform)
    print(modform)

    fit <- survfit(modform, data = ovarian)
  })

fits[[1]]$call$formula
## modform
formula(fits[[1]])
## Error in eval(expr, envir, enclos) : object 'modform' not found

Here (and in contrast to lm-fits), formula(fits[[1]]) does not work!

So, my specific question is: How can I get back the formula used to fit fits[[1]]?


Solution

  • The issue is that when x$formula is NULL, for an lm object there is a backup plan to get the formula; this doesn't exist for survfit objects

    library("plyr")
    library("survival")
    
    preds <- c("wt", "qsec")
    f <- function() lm(mpg ~ wt, data = mtcars)
    
    fits <- alply(preds, 1, function(pred) {
      modform <- reformulate(pred, response = "mpg")
      lm(modform, data = mtcars)
    })
    
    fits[[1]]$formula
    # NULL
    

    The formula can be extracted with formula(fits[[1]]) which uses the formula generic. The lm S3 method for formula is

    stats:::formula.lm
    
    # function (x, ...) 
    # {
    #   form <- x$formula
    #   if (!is.null(form)) {
    #     form <- formula(x$terms)
    #     environment(form) <- environment(x$formula)
    #     form
    #   }
    #   else formula(x$terms)
    # }
    

    So when fits[[1]]$formula returns NULL, forumla.lm looks for a terms attribute in the object and finds the formula that way

    fits[[1]]$terms
    

    The survfit objects don't have x$formula or x$terms, so formula(x) givens an error

    preds <- c("resid.ds", "rx", "ecog.ps")
    fits <-  alply(preds, 1, function(pred) {
        modform <- paste("Surv(futime, fustat)", pred, sep = " ~ ")
        modform <- as.formula(modform)
        fit <- survfit(modform, data = ovarian)
      })
    
    fits[[1]]$formula
    # NULL
    
    formula(fits[[1]]) ## error
    
    formula(fits[[1]]$terms)
    # list()
    

    You can fix this by inserting the formula into the call and evaluating it

    modform <- as.formula(paste("Surv(futime, fustat)", 'rx', sep = " ~ "))
    substitute(survfit(modform, data = ovarian), list(modform = modform))
    # survfit(Surv(futime, fustat) ~ rx, data = ovarian)
    
    eval(substitute(survfit(modform, data = ovarian), list(modform = modform)))
    
    # Surv(futime, fustat) ~ rx
    
    # Call: survfit(formula = Surv(futime, fustat) ~ rx, data = ovarian)
    # 
    #      n events median 0.95LCL 0.95UCL
    # rx=1 13      7    638     268      NA
    # rx=2 13      5     NA     475      NA
    

    Or by manually putting the formula into x$call$formula

    fit <- survfit(modform, data = ovarian)
    fit$call$formula
    # modform
    fit$call$formula <- modform
    fit$call$formula
    # Surv(futime, fustat) ~ rx
    
    fit
    # Call: survfit(formula = Surv(futime, fustat) ~ rx, data = ovarian)
    # 
    #      n events median 0.95LCL 0.95UCL
    # rx=1 13      7    638     268      NA
    # rx=2 13      5     NA     475      NA