Search code examples
rcox

extract log rank (score) test result wiht p-value for Coxph Model


I have 100 replicates of coxph model fitted in loop. I am trying to extract out log-rank score test result with p-values for each replicate in a data frame or list. I am using the following. But, it gives me only log rank score, not p-value. Any help will be very appreciated.

I can share dataset, but am not sure how to attach here.

thanks, Krina

Repl_List <- unique(dat3$Repl)
doLogRank = function(sel_name) {
dum <- dat3[dat3$Repl == sel_name,]
reg <- with(dum, coxph(Surv(TIME_day, STATUS) ~ Treatment, ties = "breslow"))
LogRank <- with(reg, reg$score) 
}
LogRank <- t(as.data.frame(lapply(Repl_List, doLogRank)))

Solution

  • Here is a mock example that I took from the help page of the coxph function. I just replicated the dataset 100 times to create your scenario. I highly recommend to start using the tidyverse packages to do such work. broom is a great addition along with dplyr and tidyr.

    library(survival)
    library(tidyverse)
    library(broom)
      test <- data.frame(time=c(4,3,1,1,2,2,3), 
                  status=c(1,1,1,0,1,1,0), 
                  x=c(0,2,1,1,1,0,0), 
                  sex=c(0,0,0,0,1,1,1))
    

    Below I am replicating the dataset 100 times using the replicate function.

    r <- replicate(test,n = 100,simplify = FALSE) %>% bind_rows %>% 
      mutate(rep = rep(seq(1,100,1),each=7))
    

    I setup the cox model as a small function that I can them pass on to each replicate of the dataframe.

    cxph_mod <- function(df) {
      coxph(Surv(time, status) ~ x + strata(sex), df)
    }
    

    Below, is the step by step process of fitting the model and extracting the values.

    tidyr::nest the dataframe purrr::map the model into each nest nest is function in library(tidyr) map is a function similar to lapply in library(purrr)

    nested <- r %>% 
      group_by(rep) %>% 
      nest %>% 
      mutate(model = data %>% map(cxph_mod))
    

    look into the first rep to see the coxph output. You will see the model object stored in the cells of the dataframe allowing easier access.

    nested %>% filter(rep==1)
    

    With each model object, now use broom to get the parameter estimates and the prediction from the model into the nested dataset

    nested <- nested %>% 
      mutate(
        ests = model %>% map(broom::tidy)
      )
    

    tidyr::unnest to view your predictions for fitting each resampled dataset

    ests <- unnest(nested,ests,.drop=TRUE) %>% dplyr::select(rep,estimate:conf.high)
    

    In this case since I am repeating the same dataset 100 times, the pvalue will be the same, but in your case you will have 100 different datasets and hence 100 different p.values.

    ggplot(data=ests,aes(y=p.value,x=rep))+geom_point()
    

    Vijay