Search code examples
rsurvival-analysiscox

Creating a full vs. reduced model with one explanatory variable


I am attempting to create a Cox proportional hazards model with only one explanatory variable. To perform a likelihood ratio test, I know I need a full and reduced model. I also know that a full model would be a separate mean for each group, and a reduced model would use an overall mean for the whole data set. How can I ensure I am setting this up properly in R? In this model z is 1 if the patient had heart surgery, and z is 0 otherwise

I have:

model<-coxph(Surv(time,delta)~z,method='breslow',data=heartdata)

X.lr <- 2*(model$loglik[2]-model$loglik[1])

Does this achieve that? I get an answer I just want to know whether this makes a full vs. reduced model since I don't have other variables to use?


Solution

  • In this case, this does work, but I think there's a better/more transparent solution using update() and anova() (I wasn't even aware that the log-likelihood component of coxph models included both the full and the null deviances).

    Using a built-in data set from the survival package:

    ## drop NAs so we are using the same data set for full & reduced models
    lungna <- na.omit(lung)
    ## fit full model
    m1 <- coxph(Surv(time, status) ~ ph.ecog, data=lungna)
    ## update model to fit intercept only (` ~ 1 ` replaces the RHS of the formula):
    ## ~ 1 means "intercept only" in R formula notation
    m0 <- update(m1, . ~ 1)
    ## anova() runs a likelihood-ratio test
    anova(m0,m1)
    

    Results:

    Analysis of Deviance Table
     Cox model: response is  Surv(time, status)
     Model 1: ~ 1
     Model 2: ~ ph.ecog
       loglik  Chisq Df P(>|Chi|)    
    1 -508.12                        
    2 -501.91 12.409  1 0.0004273 ***
    

    You can confirm that 2*diff(m1$loglik) gives 12.409, the same value reported by anova() for the deviance ("Chisq") difference, and that pchisq(chisq_val, df = 1, lower.tail = FALSE) gives the reported p-value.