r, correlation, survival-analysis

Estimating correlation with one variable of censored data in R


I am looking for a method in R to estimate the correlation (and the associated p-value) between a partially censored time-to-event variable and a continuous variable (e.g. body length).

Here is a sample of my data; the time observations are censored at 900 seconds:

length <- c(12.10, 11.00, 9.59, 10.38, 11.10, 9.39)
timeto <- c(149, 900, 26, 3, 0, 900)
event <- c(1, 0, 1, 1, 1, 0)
data <- data.frame(length, timeto, event)

Solution

  • It sounds like you want a time-to-event analysis in which the event rate depends on a continuous variable. A Cox proportional hazards model does this, and it is straightforward to fit with the survival package. Note that it reports a hazard ratio and a p-value for the covariate rather than a correlation coefficient:

    library(survival)
    
    # Create a Surv object from the time and event columns of the data frame:
    data$surv <- with(data, Surv(timeto, event))
    
    # See the summary of the Cox model:
    summary(coxph(surv ~ length, data = data))
    #> Call:
    #> coxph(formula = surv ~ length, data = data)
    #> 
    #>   n= 6, number of events= 4 
    #> 
    #>          coef exp(coef) se(coef)     z Pr(>|z|)
    #> length 0.1698    1.1850   0.4808 0.353    0.724
    #> 
    #>        exp(coef) exp(-coef) lower .95 upper .95
    #> length     1.185     0.8439    0.4618     3.041
    #> 
    #> Concordance= 0.643  (se = 0.152 )
    #> Likelihood ratio test= 0.12  on 1 df,   p=0.7
    #> Wald test            = 0.12  on 1 df,   p=0.7
    #> Score (logrank) test = 0.13  on 1 df,   p=0.7
    

    Created on 2020-06-21 by the reprex package (v0.3.0)
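
If you need the numbers programmatically rather than reading them off the printed summary, the following is a minimal sketch of how they might be pulled out of the fitted model. It assumes the same sample data as in the question; the names `dat`, `fit`, `hr`, `p_value`, `c_index` and `somers_d` are chosen here for illustration only. The last line rescales the concordance to 2C - 1 (Somers' D), which is a rank-based measure of how well the model's risk ordering agrees with the observed event ordering. Treat it as a rough stand-in for a correlation, not as a Pearson coefficient.

```r
library(survival)

# Same sample data as in the question (times censored at 900 seconds)
dat <- data.frame(
  length = c(12.10, 11.00, 9.59, 10.38, 11.10, 9.39),
  timeto = c(149, 900, 26, 3, 0, 900),
  event  = c(1, 0, 1, 1, 1, 0)
)

# Fit the Cox model with the Surv object built directly in the formula
fit <- coxph(Surv(timeto, event) ~ length, data = dat)
s <- summary(fit)

# Hazard ratio and Wald p-value for the covariate
hr      <- s$coefficients["length", "exp(coef)"]
p_value <- s$coefficients["length", "Pr(>|z|)"]

# Concordance (C) and its rescaling to the range [-1, 1]
c_index  <- unname(s$concordance["C"])
somers_d <- 2 * c_index - 1

print(c(hr = hr, p_value = p_value, c_index = c_index, somers_d = somers_d))
```

With only six observations and four events, the estimates are very imprecise, so the values are best read as an illustration of the workflow rather than as evidence about the relationship.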