Search code examples
rsurvival-analysiscox-regressionmultivariate-testing

How to get the wald test of a specific variable in a multivariate Coxph?


I fitted a multivariate Cox model using the R survival package like follow:

library(survival)
data(lung)
res.cox1 <- coxph(Surv(time, status) ~  sex + ph.karno + wt.loss, data =  lung)
res.cox1
Call:
coxph(formula = Surv(time, status) ~ sex + ph.karno + wt.loss, 
    data = lung)

              coef exp(coef)  se(coef)      z       p
sex      -0.521839  0.593428  0.174454 -2.991 0.00278
ph.karno -0.015243  0.984873  0.005988 -2.546 0.01091
wt.loss  -0.002523  0.997480  0.006233 -0.405 0.68558

Likelihood ratio test=16.42  on 3 df, p=0.0009298
n= 214, number of events= 152 
   (14 observations deleted due to missingness)

How can one get the 3 values of the Wald test of each variable (sex, ph.karno and wt.loss) in a multivariate Cox model (sex + ph.karno + wt.loss)?

I tried to look at the structure of the coxph and the summary of the coxph object, and I found only one single value of the wald test $wald.test : num 16.5, $ waldtest : Named num [1:3] 1.65e+01 3.00 8.81e-04 ..- attr(*, "names")= chr [1:3] "test" "df" "pvalue"!

What does this test value correspond to? How to get the 3 values of the Wald test of sex, ph.karno and wt.loss ?

str(res.cox1)
List of 20
 $ coefficients     : Named num [1:3] -0.52184 -0.01524 -0.00252
  ..- attr(*, "names")= chr [1:3] "sex" "ph.karno" "wt.loss"
 $ var              : num [1:3, 1:3] 3.04e-02 -6.78e-05 2.77e-05 -6.78e-05 3.59e-05 ...
 $ loglik           : num [1:2] -680 -672
 $ score            : num 16.9
 $ iter             : int 4
 $ linear.predictors: num [1:214] 0.0756 0.0756 0.0857 -0.039 0.7232 ...
 $ residuals        : Named num [1:214] -0.147 -2.93 0.58 -1.613 -5.599 ...
  ..- attr(*, "names")= chr [1:214] "2" "3" "4" "5" ...
 $ means            : Named num [1:3] 1.4 82.06 9.83
  ..- attr(*, "names")= chr [1:3] "sex" "ph.karno" "wt.loss"
 $ method           : chr "efron"
 $ n                : int 214
 $ nevent           : num 152
 $ terms            :Classes 'terms', 'formula'  language Surv(time, status) ~ sex + ph.karno + wt.loss
  .. ..- attr(*, "variables")= language list(Surv(time, status), sex, ph.karno, wt.loss)
  .. ..- attr(*, "factors")= int [1:4, 1:3] 0 1 0 0 0 0 1 0 0 0 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:4] "Surv(time, status)" "sex" "ph.karno" "wt.loss"
  .. .. .. ..$ : chr [1:3] "sex" "ph.karno" "wt.loss"
  .. ..- attr(*, "term.labels")= chr [1:3] "sex" "ph.karno" "wt.loss"
  .. ..- attr(*, "specials")=Dotted pair list of 2
  .. .. ..$ strata: NULL
  .. .. ..$ tt    : NULL
  .. ..- attr(*, "order")= int [1:3] 1 1 1
  .. ..- attr(*, "intercept")= num 1
  .. ..- attr(*, "response")= int 1
  .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
  .. ..- attr(*, "predvars")= language list(Surv(time, status), sex, ph.karno, wt.loss)
  .. ..- attr(*, "dataClasses")= Named chr [1:4] "nmatrix.2" "numeric" "numeric" "numeric"
  .. .. ..- attr(*, "names")= chr [1:4] "Surv(time, status)" "sex" "ph.karno" "wt.loss"
 $ assign           :List of 3
  ..$ sex     : int 1
  ..$ ph.karno: int 2
  ..$ wt.loss : int 3
 $ wald.test        : num 16.5
 $ concordance      : Named num [1:7] 11071 6046 96 22 0 ...
  ..- attr(*, "names")= chr [1:7] "concordant" "discordant" "tied.x" "tied.y" ...
 $ na.action        : 'omit' Named int [1:14] 1 20 36 44 56 63 108 138 178 183 ...
  ..- attr(*, "names")= chr [1:14] "1" "20" "36" "44" ...
 $ y                : 'Surv' num [1:214, 1:2]  455  1010+  210   883  1022+  310   361   218   166   170  ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:214] "2" "3" "4" "5" ...
  .. ..$ : chr [1:2] "time" "status"
  ..- attr(*, "type")= chr "right"
 $ timefix          : logi TRUE
 $ formula          :Class 'formula'  language Surv(time, status) ~ sex + ph.karno + wt.loss
  .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
 $ call             : language coxph(formula = Surv(time, status) ~ sex + ph.karno +      wt.loss, data = lung)
 - attr(*, "class")= chr "coxph"


str(summary(res.cox1))
List of 14
 $ call        : language coxph(formula = Surv(time, status) ~ sex + ph.karno +      wt.loss, data = lung)
 $ fail        : NULL
 $ na.action   : 'omit' Named int [1:14] 1 20 36 44 56 63 108 138 178 183 ...
  ..- attr(*, "names")= chr [1:14] "1" "20" "36" "44" ...
 $ n           : int 214
 $ loglik      : num [1:2] -680 -672
 $ nevent      : num 152
 $ coefficients: num [1:3, 1:5] -0.52184 -0.01524 -0.00252 0.59343 0.98487 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:3] "sex" "ph.karno" "wt.loss"
  .. ..$ : chr [1:5] "coef" "exp(coef)" "se(coef)" "z" ...
 $ conf.int    : num [1:3, 1:4] 0.593 0.985 0.997 1.685 1.015 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:3] "sex" "ph.karno" "wt.loss"
  .. ..$ : chr [1:4] "exp(coef)" "exp(-coef)" "lower .95" "upper .95"
 $ logtest     : Named num [1:3] 16.42029 3 0.00093
  ..- attr(*, "names")= chr [1:3] "test" "df" "pvalue"
 $ sctest      : Named num [1:3] 1.69e+01 3.00 7.52e-04
  ..- attr(*, "names")= chr [1:3] "test" "df" "pvalue"
 $ rsq         : Named num [1:2] 0.0739 0.9983
  ..- attr(*, "names")= chr [1:2] "rsq" "maxrsq"
 $ waldtest    : Named num [1:3] 1.65e+01 3.00 8.81e-04
  ..- attr(*, "names")= chr [1:3] "test" "df" "pvalue"
 $ used.robust : logi FALSE
 $ concordance : Named num [1:2] 0.646 0.0274
  ..- attr(*, "names")= chr [1:2] "C" "se(C)"
 - attr(*, "class")= chr "summary.coxph"

Thank you!


Solution

  • A "Wald test" is based on the assumption that parameter values from regression processes will be normally distributed. You examine the ratio of a coefficient's estimate ("coef") divided by the estimate's standard error of the estimate ("coef(se)") and seeing if the 95% confidence interval for that ratio would include the value zero. Stated operationally: take coef +/- 1.96*se(coef) and see if the interval includes zero. Alternatively and equivalently, you can take the ratio: coef/se(coef), and see if it's absolute value is greater than 1.96. Perhaps I'm being pedantic when I say that a "test" is a yes/no result, answering the question "does the ratio value lie in a critical interval or not", whereas a "test statistic", like a z-value, is a pure number.

    There are actually 4 Wald tests reported in the summary you constructed. Three of them are for the individual coefficients and one of them is for the overall model and that is the one named "wald". But you don't want the overall model Wald test. You want the results from the "coefficient" matrix of the summary()-processed result (not the "coefficient" value from the coxph() result.) When you take such ratios, it's analyzed as a z-test, so you do not square the statistic (unless of course, you want to use a chi-square table, which is when Z^2 would be used for the assessment.)

    summ.coef <- summary(res.cox1)$coefficients
    
    ( Wald.ratios <- summ.coef[,"coef"]/summ.coef[,"se(coef)"] )
           sex   ph.karno    wt.loss 
    -2.9912645 -2.5456273 -0.4048609 
    identical(Wald.ratios, summ.coef[, "z"])
    #[1] TRUE
    

    If you wanted to focus on a single variable by name:

     summ.coef["sex", "coef"]/summ.coef["sex", "se(coef)"]