r survival

Does anyone know what the difference is between n and strata in a survfit object?


Whenever I use survfit in R I get different values for n and strata. For example, I get n: 150, 167 (which add up to 317, the total number of input rows) and strata: 149, 163.

From the help page ?survival::survfit.object:

n = total number of subjects in each curve.

strata = if there are multiple curves, this component gives the number of elements of the time etc. vectors corresponding to the first curve, the second curve, and so on. The names of the elements are labels for the curves.

I don't understand why the numbers are different.

EDIT: I did consider whether repeated time points were the cause. The example data set contains 9 duplicated time values (18 rows in total), which would mean only 317 - 9 = 308 values are used. But strata adds up to 149 + 163 = 312, not 308. The code used is:

library(survival)
library(survminer)
survival <- surv_fit(Surv(time = Time, event = Event) ~ Group, data = x, conf.int = 0.95)

Update: It has to do with repeated times within each group. If I separate the data into group A and group B, there is 1 duplicated time in group A and 4 duplicated times in group B. Therefore there are 317 - 1 - 4 = 312 time points in the plot.

And in each group it would be: A: 150 - 1 = 149; B: 167 - 4 = 163

As strata shows.
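The within-group counting above can be checked with base R alone. This is only a sketch on made-up toy data (the data frame below is hypothetical and merely borrows the column names Time and Group from the question); it shows why counting duplicates over the pooled data gives a different number than counting them inside each group.

```r
# Hypothetical toy data standing in for the data frame `x` from the question.
# The time 5 occurs twice in group A and once in group B.
x <- data.frame(
  Time  = c(2, 5, 5, 8, 3, 5, 7),
  Group = c("A", "A", "A", "A", "B", "B", "B")
)

# Rows per group (this is what `n` counts).
n_per_group <- tapply(x$Time, x$Group, length)

# Distinct times per group (this is what `strata` counts).
distinct_per_group <- tapply(x$Time, x$Group, function(t) length(unique(t)))

# Duplicates counted over the pooled data: a time shared by A and B
# is flagged here even though it is not a duplicate within either group.
pooled_duplicates <- sum(duplicated(x$Time))

# Duplicates counted inside each group.
within_duplicates <- n_per_group - distinct_per_group

print(n_per_group)
print(distinct_per_group)
print(pooled_duplicates)
print(within_duplicates)
```

In this toy case the pooled count is larger than the sum of the within-group counts, which is the same kind of mismatch as 308 versus 312 in the question.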


Solution

  • Thank you to @kath for their help.

    n refers to the number of subjects (rows) in each group.

    strata refers to the number of distinct time points in each group, i.e. after removing duplicates within each group.
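To see the two components side by side, here is a minimal sketch using survival::survfit directly on made-up data (the toy data frame is an assumption for illustration, not the data from the question). Group A contains one repeated time, so its strata entry should be one less than its n entry.

```r
library(survival)

# Hypothetical toy data: time 5 appears twice in group A and once in group B.
toy <- data.frame(
  Time  = c(2, 5, 5, 8, 3, 5, 7),
  Event = c(1, 1, 0, 1, 1, 1, 0),
  Group = c("A", "A", "A", "A", "B", "B", "B")
)

fit <- survfit(Surv(Time, Event) ~ Group, data = toy)

# Number of subjects in each curve.
print(fit$n)

# Number of entries in fit$time belonging to each curve,
# i.e. the number of distinct time points per group.
print(fit$strata)

# The time vector is the concatenation of the per-curve time points,
# so its length equals the sum of the strata counts.
print(length(fit$time))
```

The time 5 shared between groups A and B does not reduce either strata count; only the repeat inside group A does.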