I'm trying to find an efficient way to create a dataset that contains the original data, coefficient estimates, model fit, and fitted observations from a coxph()
survival model. Currently, my code looks like this:
ex_model <- lung%>% #lung dataset from survival package
nest(-sex)%>%
mutate(fit = map(data, ~
coxph(Surv(time, status) ~ age +
wt.loss +
meal.cal,
data = .)),
glance = map(fit, glance),
tidy = map(fit, tidy)))%>%
glimpse()
# output
Columns: 5
# $ sex <dbl> 1, 2
# $ data <list> [<tbl_df[138 x 9]>], [<tbl_df[90 x 9]>]
# $ fit <list> [0.0205745842, 0.0047088643, -0.0001776546, 1.962976e-04, -3.220915e-06, 1.011575e-06, -3.220915e-06, 5.999414e-05, -1.525540e-07, 1.011575e-06, -1.~
# $ glance <list> [<tbl_df[1 x 18]>], [<tbl_df[1 x 18]>]
# $ tidy <list> [<tbl_df[3 x 5]>], [<tbl_df[3 x 5]>]
which gives me a dataframe with a column for the nest variable (sex
), and four list columns data
, fit
, glance
, and tidy
. I would like to add a column augment
containing the fittedd values for each observation but have been unsucessful mapping the augment function to fit
.
Here is an example of code which generates my desired output using lm()
instead of coxph()
ex_model <- lung%>% #lung dataset from survival package
nest(-sex)%>%
mutate(fit = map(data, ~
lm(status ~ age +
wt.loss +
meal.cal,
data = .)),
glance = map(fit, glance),
tidy = map(fit, tidy),
augment = map(fit, augment))%>%
glimpse()
# output
# $ sex <dbl> 1, 2
# $ data <list> [<tbl_df[138 x 9]>], [<tbl_df[90 x 9]>]
# $ fit <list> [1.415301e+00, 7.049341e-03, 6.981800e-04, -7.171368e-05, 0.18272053, 0.25767747, -0.90016296, 0.25907699, 0.15038564, 0.23754238, 0.22724620, 0.32~
# $ glance <list> [<tbl_df[1 x 12]>], [<tbl_df[1 x 12]>]
# $ tidy <list> [<tbl_df[4 x 5]>], [<tbl_df[4 x 5]>]
# $ augment <list> [<tbl_df[106 x 11]>], [<tbl_df[65 x 11]>]
When I use the mutate(augment = map(fit, augment))%>%
syntax with coxph()
, RStudio returns an error:
Did you want `data = c(inst, time, status, age, ph.ecog, ph.karno, pat.karno, meal.cal,
wt.loss)`?Error: Problem with `mutate()` input `augment`.
x Must specify either `data` or `newdata` argument.
i Input `augment` is `map(fit, augment)`.
Is this a problem with my syntax, or is there a more fundamental reason I can't augment(fit)
here? What is the most efficient way around this issue?
The error you want to focus on is
x Must specify either `data` or `newdata` argument.
This error comes from augment()
and it says that to use augment you have to pass in data in addition to the model. So you need to do augment(model_fit, newdata = my_new_data)
. For more information on augment.coxph()
look here.
library(tidyverse)
library(survival)
library(broom)
ex_models <- lung %>%
nest(data = c(inst, time, status, age, ph.ecog, ph.karno, pat.karno, meal.cal, wt.loss)) %>%
mutate(fit = map(data, ~ coxph(Surv(time, status) ~ age + wt.loss + meal.cal, data = .)),
glance = map(fit, glance),
tidy = map(fit, tidy),
augment = map(fit, augment, newdata = lung))
glimpse(ex_models)
#> Rows: 2
#> Columns: 6
#> $ sex <dbl> 1, 2
#> $ data <list> [<tbl_df[138 x 9]>], [<tbl_df[90 x 9]>]
#> $ fit <list> [0.0205745842, 0.0047088643, -0.0001776546, 1.962976e-04, -3.2…
#> $ glance <list> [<tbl_df[1 x 18]>], [<tbl_df[1 x 18]>]
#> $ tidy <list> [<tbl_df[3 x 5]>], [<tbl_df[3 x 5]>]
#> $ augment <list> [<tbl_df[228 x 12]>], [<tbl_df[228 x 12]>]
Created on 2021-08-14 by the reprex package (v2.0.1)