Please consider the following:
To create a survival curve, one can use the survfit function from the survival package.
My aim is to write a function that (among other things) creates such a curve, but the function should work with different data.frames whose column names also differ. The grouping variable will also depend on the respective dataset.
I can pass different data.frames to the function, but supplying the column names to the survfit and Surv functions does not work for me.
Any help is greatly appreciated.
In my opinion this is a different problem from simply passing a data.frame column name to a function, as discussed here: Pass a data.frame column name to a function
# required libraries
library(survival)
library(flexsurv)
#### Examples that work without own function ===================================
# survfit with lung data
survfit(Surv(time = time, event = status) ~ 1, data = lung)
#> Call: survfit(formula = Surv(time = time, event = status) ~ 1, data = lung)
#>
#> n events median 0.95LCL 0.95UCL
#> 228 165 310 285 363
survfit(Surv(time = time, event = status) ~ sex, data = lung)
#> Call: survfit(formula = Surv(time = time, event = status) ~ sex, data = lung)
#>
#> n events median 0.95LCL 0.95UCL
#> sex=1 138 112 270 212 310
#> sex=2 90 53 426 348 550
# survfit with bc data
survfit(Surv(time = rectime, event = censrec) ~ 1, data = bc)
#> Call: survfit(formula = Surv(time = rectime, event = censrec) ~ 1,
#> data = bc)
#>
#> n events median 0.95LCL 0.95UCL
#> 686 299 1807 1587 2030
# Create a generic function that takes dataset-specific arguments
SurvFun <- function(fun.time, fun.event, grouping = 1, fun.dat){
  survfit(Surv(time = fun.time, event = fun.event) ~ grouping, data = fun.dat)
}
#### Own function that doesn't work ============================================
# This should work for data = lung
SurvFun(fun.time = time, fun.event = status, grouping = 1, fun.dat = lung)
#> Error in Surv(time = fun.time, event = fun.event): Time variable is not numeric
Created on 2018-07-05 by the reprex package (v0.2.0).
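When column names aren't surrounded by quotes, they are passed as symbols. Symbols are much harder to pass around than ordinary values, and the same applies to formulas. In your version, Surv looks for a column literally named fun.time in lung, finds none, and falls back to the function argument, which evaluates to the R function time rather than a column; hence "Time variable is not numeric". You need some metaprogramming for this to work. Here's one way to rewrite your function so that it works: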
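SurvFun <- function(fun.time, fun.event, grouping = 1, fun.dat) {
  # Capture each argument unevaluated, as the expression the caller typed
  params <- list(fun.time = substitute(fun.time),
                 fun.event = substitute(fun.event),
                 grouping = substitute(grouping),
                 fun.dat = substitute(fun.dat))
  # Splice the captured expressions into the survfit() call
  expr <- substitute(survfit(Surv(time = fun.time, event = fun.event) ~ grouping,
                             data = fun.dat), params)
  # Evaluate the finished call in the caller's environment
  eval.parent(expr)
}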
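If you are willing to pass the column names as quoted strings instead, you can avoid the metaprogramming altogether by building the formula as text and converting it with as.formula(). The following is only a minimal sketch of that alternative, not the approach shown above: the name surv_fun_chr and its arguments are made up for illustration, and it assumes the column names are syntactically valid R names (no spaces or special characters).

```r
library(survival)

# Hypothetical helper (not part of the survival package): column names are
# passed as character strings, so no symbols have to be captured.
surv_fun_chr <- function(time_col, event_col, dat, group = "1") {
  # Build e.g. "Surv(time, status) ~ sex" and convert it to a formula
  f <- as.formula(sprintf("Surv(%s, %s) ~ %s", time_col, event_col, group))
  survfit(f, data = dat)
}

fit_all <- surv_fun_chr("time", "status", lung)                 # no grouping
fit_sex <- surv_fun_chr("time", "status", lung, group = "sex")  # grouped by sex
fit_sex$n  # number of subjects in each sex group
```

One trade-off: the call stored in the result (and printed at the top of the output) reads survfit(formula = f, data = dat) rather than the spelled-out formula, whereas the substitute() version reproduces the call exactly as if you had typed it by hand.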