Search code examples
rcox-regressionsurvival

R - cox.zph() not returning rho values, p-values differ from examples


When I run cox.zph on a Cox model, I get a different type of return than I am seeing everywhere else.

I tried running the following code:

library(survival) #version 3.1-7
library(survminer) #version 0.4.6

res.cox <- coxph(Surv(time, status) ~ age + sex + wt.loss, data = lung)
#lung data is in the survival package and loads from there.

( test.ph <- cox.zph(res.cox) )

This gives my the following return:

         chisq df    p
age     0.5077  1 0.48
sex     2.5489  1 0.11
wt.loss 0.0144  1 0.90
GLOBAL  3.0051  3 0.39

However, examples elsewhere (including the one I'm am trying to follow here) return a table with a "rho" column, as below:

            rho chisq     p
age     -0.0483 0.378 0.538
sex      0.1265 2.349 0.125
wt.loss  0.0126 0.024 0.877
GLOBAL       NA 2.846 0.416

In addition, whatever it throwing this off appears to also be altering my chi-sq and p-values as well.

Furthermore, when I subsequently try to plot the Schoenfeld residuals using ggcoxzph(test.ph) I get the following plots:

My Schoenfeld residual plots

Versus the example:

sthda version of the same plots

These issues are causing a massive block to my group's efforts on a current project, and any help offered will be much appreciated.

Thanks in advance!


Solution

  • Try updating your version of survival package. It works for me as it should (print method shows "rho").

    I'm using version 2.43-3

    EDIT: 42 was right, I was installing from an out-of-date mirror. I just updated to the latest version.

    Of course one solution is you roll back your install to an older version. But assuming you don't want to do that...

    It appears you will need to calculate what you want yourself. I don't know if this is the most concise way but I simply adapted the source code from the old version of the package into a function where you can pass your fitted object and your cox.zph test object.

    library(survival) #version 3.1-7
    library(survminer) #version 0.4.6
    
    res.cox <- coxph(Surv(time, status) ~ age + sex + wt.loss, data = lung)
    #lung data is in the survival package and loads from there.
    
    test.ph <- cox.zph(res.cox) 
    
    your_func <- function(fit, transform = "km", new_cox.zph = NULL) {
      sresid <- resid(fit, "schoenfeld")
      varnames <- names(fit$coefficients)
      nvar <- length(varnames)
      ndead <- length(sresid)/nvar
      if (nvar == 1) {
        times <- as.numeric(names(sresid)) 
      } else {
        times <- as.numeric(dimnames(sresid)[[1]])
      }
    
      if (is.character(transform)) {
        tname <- transform
        ttimes <- switch(transform, identity = times, rank = rank(times), 
                         log = log(times), km = {
                           temp <- survfitKM(factor(rep(1, nrow(fit$y))), 
                                             fit$y, se.fit = FALSE)
                           t1 <- temp$surv[temp$n.event > 0]
                           t2 <- temp$n.event[temp$n.event > 0]
                           km <- rep(c(1, t1), c(t2, 0))
                           if (is.null(attr(sresid, "strata"))) 1 - km else (1 - 
                                                                               km[sort.list(sort.list(times))])
                         }, stop("Unrecognized transform"))
      }
      else {
        tname <- deparse(substitute(transform))
        if (length(tname) > 1) 
          tname <- "user"
        ttimes <- transform(times)
      }
      xx <- ttimes - mean(ttimes)
      r2 <- sresid %*% fit$var * ndead
      test <- xx %*% r2
      corel <- c(cor(xx, r2))
      cbind(rho = c(corel,NA), new_cox.zph$table)
    }
    

    Call the function

    your_func(fit = res.cox, new_cox.zph = test.ph)
    
                    rho      chisq df         p
    age     -0.04834523 0.50774082  1 0.4761185
    sex      0.12651872 2.54891522  1 0.1103700
    wt.loss  0.01257876 0.01444092  1 0.9043482
    GLOBAL           NA 3.00505434  3 0.3908466
    

    UPDATE

    I don't know about the difference in the chisq and p calculations between versions. You may want to check release notes for what changed between versions. But for your purposes, I'm not sure there will be a difference in "rho" which appears to simply be a pearson correlation between 1. difference from observation time and mean time [ttimes - mean(ttimes)] and 2. fitted values multiplied by schoenfeld residuals (scaled by number dead). From the code..

    sresid <- resid(fit, "schoenfeld")
    xx <- ttimes - mean(ttimes)
    r2 <- sresid %*% fit$var * ndead
    test <- xx %*% r2
    corel <- c(cor(xx, r2))
    ##corel is rho