I am trying to use the cforest function in the R package party to analyse some right-censored survival data. Every time I use the predict function I get Inf for each value, which means that a concordance index cannot be generated.
My data can be downloaded here: https://www.dropbox.com/s/nt9s3p1rdafq465/test_data.csv?dl=0
Example:
library(party)
library(survival)
mydata <- read.csv(file = "test_data.csv", header = TRUE, sep = ",", row.names = NULL)
train <- head(mydata, n = 800)
test <- tail(mydata, n = 37)
cif_result <- cforest(Surv(timeToEvent, status) ~ V1 + V2 + V3 + V4 + V5 + V6,
                      data = train,
                      control = cforest_classical())
cforest_pred <- predict(object = cif_result, newdata = test)
cforest_pred
837 838 839 840 841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856
Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf
857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873
Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf Inf
Am I doing something wrong? Why does cforest only predict Inf on this data?
The predict() method for survival trees/forests in the party package returns the median survival time. As events are observed for fewer than 20% of the observations, the estimated survival curve never reaches 0.5, so a finite median survival time cannot be computed. Hence it is Inf. As an example, consider the full-sample fit:
m <- survfit(Surv(timeToEvent, status) ~ 1, data = train)
plot(m)
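The same effect can be reproduced without the original file. The following is a minimal sketch on a small made-up data set (the `toy` data frame and its values are assumptions for illustration, not the data from the question): only 2 of 10 observations have an event, so the Kaplan-Meier curve levels off at about 0.8 and never reaches 0.5.

```r
library(survival)

# Hypothetical toy data: 10 subjects, events only for the first two,
# all later observations are right-censored
toy <- data.frame(
  timeToEvent = 1:10,
  status      = c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0)
)

m_toy <- survfit(Surv(timeToEvent, status) ~ 1, data = toy)

# Lowest value the Kaplan-Meier estimate reaches (about 0.8 here)
min_surv <- min(m_toy$surv)

# Median survival time as reported by survfit; it is NA because the
# curve never drops to 0.5
med <- summary(m_toy)$table[["median"]]

print(min_surv)
print(med)
```

Note that survfit() reports the undefined median as NA, whereas the predict() output shown in the question reports it as Inf. If per-observation survival curves are needed rather than medians, the party package also provides treeresponse(), which may be worth checking in its documentation for forest objects.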