Subset by samples for an ExpressionSet object


I have an ExpressionSet object with 100 samples:

> length(sampleNames(eset1))
100

I also have a vector of the names of 75 samples (not the data itself):

> length(vecOf75)
75

How can I subset eset1 (and save the result) to the samples named in vecOf75? That is, I want to drop the samples in eset1 whose names are not listed in vecOf75. Bear in mind that some of the 75 sample names may not be present in eset1. Thus,

> length(sampleNames(eset1))

should now give at most 75 (fewer if any of the names are missing from eset1).


Solution

  • An ExpressionSet can be subset like a matrix, so maybe

    eset2 = eset1[, sampleNames(eset1) %in% vecOf75]
    

    or if all(vecOf75 %in% sampleNames(eset1)) then just

    eset1[, vecOf75]
    

    Not sure what 'save' means: either save(eset2, file = "some_file.rda") (the file argument must be named), or extract the components with exprs(eset2), pData(eset2), etc., and write them out with write.table and other standard R functions.
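For anyone who wants to try the approach end to end, here is a minimal sketch on toy data. It assumes the Biobase package is installed. The 5 x 4 matrix, the sample names S1 to S4, the vector `wanted` (standing in for vecOf75) and the output file names are all made up for illustration and are not from the question.

```r
library(Biobase)  # Bioconductor package that defines ExpressionSet

# Toy stand-in for eset1: 5 features x 4 samples (hypothetical names).
mat <- matrix(as.numeric(1:20), nrow = 5,
              dimnames = list(paste0("gene", 1:5), paste0("S", 1:4)))
eset1 <- ExpressionSet(assayData = mat)

# Toy stand-in for vecOf75; "S9" is deliberately absent from eset1.
wanted <- c("S3", "S1", "S9")

# Logical indexing: keeps eset1's own sample order and ignores
# requested names that are not present.
eset2 <- eset1[, sampleNames(eset1) %in% wanted]

# Alternative: intersect() keeps the order given in `wanted`.
eset3 <- eset1[, intersect(wanted, sampleNames(eset1))]

# Requested names that were not found in eset1.
missing_samples <- setdiff(wanted, sampleNames(eset1))

# Save the whole object; note the named `file` argument.
out_file <- file.path(tempdir(), "eset2.rda")
save(eset2, file = out_file)

# Or write just the expression matrix as tab-separated text.
txt_file <- file.path(tempdir(), "eset2_exprs.txt")
write.table(exprs(eset2), file = txt_file, sep = "\t",
            quote = FALSE, col.names = NA)
```

The two subsetting forms differ only in sample order: the %in% version follows eset1, the intersect() version follows the name vector. Indexing directly with a name that is not in eset1 (for example eset1[, wanted] here) would be expected to fail, which is why the check with all(... %in% ...) matters before using that shortcut.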