When I do cox regression using R, I found the proporitonal hazard assumption is violated. So I decided to divide the data into different time intervals using the 'survSplit' function and calculate the HR for each time interval.
For example:
`D1 <- survSplit(Surv(time, status) ~ ., data= data1, cut=c(40, 60),
episode= "tgroup", id="id")`
`coxph(Surv(tstart, time, status) ~ age + class:strata(tgroup), data=D1)`
Now I want to do the same for multiply-imputed datasets (n=5).
imp <- mice(data1, seed = 12345, n=5,print = FALSE)
fit <- with(imp, coxph(Surv(tstart, time, status) ~ age + class:strata(tgroup))
`summary(pool(fit))`
My question is how I can post process the imputed data to divide each imputed dataset into different time intervals using 'survSplit' between step 1 and step 2. After post-processing, I can continue to do step 2 and step3.
Although I can extract each imputed dataset and then divide it using 'survSplit', I am not able to continue step2 and step 3 using the MICE package. Kindly help. Thanks.
In this case the best approach is to use {}
inside the with()
command, which will allow you to place multiple statements inside. Here's a demonstration, based on the example in the survSplit()
documentation.
library(survival)
library(mice)
set.seed(123)
# use the veteran dataset with some values made missing
veteran2 <- veteran
veteran2[which(rbinom(nrow(veteran2),1,0.1) == 1),
"karno"] <- NA
# make imputations
imps <- mice(veteran2, printFlag = FALSE)
# fit the models, with survSplit used inside each call
mods <- with(imps,
{
dat <- data.frame(mget(ls()))
dat_split <- survSplit(Surv(time, status) ~ .,
data = dat,
cut = c(60, 120),
episode = "timegroup")
coxph(Surv(tstart, time, status) ~ karno*strata(timegroup),
data = dat_split)
})
pooled <- pool(mods)
summary(pooled)
#> term estimate std.error statistic
#> 1 karno -0.04986785 0.007715564 -6.463279
#> 2 karno:strata(timegroup)timegroup=2 0.03645985 0.015247181 2.391252
#> 3 karno:strata(timegroup)timegroup=3 0.04104428 0.013389159 3.065486
#> df p.value
#> 1 70.51041 1.141572e-08
#> 2 94.68223 1.876833e-02
#> 3 108.83144 2.740870e-03
c(t0_60 = pooled$pooled$estimate[1],
t60_120 = sum(pooled$pooled$estimate[c(1,2)]),
t_120 = sum(pooled$pooled$estimate[c(1,3)]))
#> t0_60 t60_120 t_120
#> -0.049867847 -0.013407999 -0.008823565
Created on 2022-08-24 by the reprex package (v2.0.1)