r, expression, bioconductor, survival-analysis

How to include only specific cases of an ExpressionSet (eset) in a survival analysis (KM curves) in R?


I have a question regarding KM analysis. I have an ExpressionSet; the first few cases look like this:

eset

ExpressionSet (storageMode: lockedEnvironment)
assayData: 6 features, 6 samples 
  element names: exprs 
protocolData: none
phenoData
  sampleNames: 1 2 ... 6 (6 total)
  varLabels: age_at_diagnosis last_follow_up_status ... lymph_nodes_removed (9 total)
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
Annotation:  

This is the expression matrix of eset:

        1        2        3        4        5        6
a 8.676978 9.653589 9.033589 8.814855 8.736406 9.274265
b 5.298711 5.378801 5.606122 5.316155 5.303613 5.449802
c 5.430877 5.199253 5.449121 5.309371 5.438538 5.347851
d 6.075331 6.687887 5.910885 5.628740 6.392422 5.908698
e 5.595625 6.010127 5.683969 5.479983 6.013500 5.939949
f 5.453928 5.454185 5.501577 5.471941 5.525027 5.531743

and here is the pData:

    age Status  MEN group grade size stage LNP LNR     time  mn doc
1 52.79      d post     4     2   18     2   1  12 3.865753 pos   0
2 32.61      d  pre     3     3   16     2   5  23 1.679452 neg   1
3 66.83      a post     4     3   15     3   8  17 5.616438 pos   0
4 71.21      a post     4     3   21     2   1  12 1.169863 pos   1
5 76.84 d-d.s. post     4     3   50     2   3  24 3.602740 pos   1
6 60.77      a post     4     2   23     2   0   2 1.367123 pos   0

I know how to generate KM curves for the whole dataset; my code is below. I have included only the first few cases because of the space limits on Stack Overflow:

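library(survival)

# ab is assumed to hold the phenotype table shown above, e.g. ab <- pData(eset)
c <- Surv(as.numeric(ab$time), ab$doc)

# KM curves stratified by mn
plot(survfit( c ~ as.factor(ab$mn)))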

So, my question is: how can I modify this code so that it uses only the cases where ab$mn == 'neg'?

Thanks in advance,


Solution

  • I would follow Terry Therneau's advice on how to use the Surv function: do not build Surv objects outside the model-fitting call (survfit here, or coxph). Putting Surv() inside the formula and passing data=ab also lets you use the subset argument, a handy feature of these functions:

    plot(survfit( Surv(as.numeric(time), doc) ~ as.factor(mn), data=ab, subset = mn == 'neg' ))
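
For anyone who wants to try this without the original data, here is a minimal sketch that rebuilds `ab` from the six rows shown in the question (only the `time`, `doc` and `mn` columns; the names `fit_neg`, `ab_neg` and `fit_pre` are just illustrative). It assumes `ab` is the phenotype table of the ExpressionSet. Since only one group is left after subsetting, the sketch uses `~ 1` instead of `~ as.factor(mn)`, and it compares the `subset` argument with subsetting the data frame beforehand.

```r
library(survival)

# Assumption: ab mirrors the pData rows printed in the question
ab <- data.frame(
  time = c(3.865753, 1.679452, 5.616438, 1.169863, 3.602740, 1.367123),
  doc  = c(0, 1, 0, 1, 1, 0),
  mn   = c("pos", "neg", "pos", "pos", "pos", "pos")
)

# Option 1: let survfit drop the unwanted rows via its subset argument
fit_neg <- survfit(Surv(time, doc) ~ 1, data = ab, subset = mn == "neg")

# Option 2: subset the data frame first, then fit on the reduced data
ab_neg  <- ab[ab$mn == "neg", ]
fit_pre <- survfit(Surv(time, doc) ~ 1, data = ab_neg)

# Both fits use the same cases, so they describe the same KM curve
plot(fit_neg, conf.int = FALSE, xlab = "time", ylab = "survival probability")
```

With only six rows there is a single 'neg' case, so the curve itself is not informative here; the point is the mechanics. If `ab` comes from `pData(eset)`, the same logical vector can also subset the samples of the ExpressionSet itself, e.g. `eset[, ab$mn == "neg"]`.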