r seurat

Subset a complex sparse matrix based on a logical column


I'm working with a Seurat object, and after doing some quality control I have a metadata column called discard containing TRUE or FALSE, depending on whether the row in question failed QC and should be deleted. How do I remove those rows? I have tried every flavour of subset I can find documentation on, but the only one that didn't give me an error was subset(object, object@meta.data$discard), which gives me a matrix of only the rows that should be discarded! subset(object, !object@meta.data$discard) is apparently not possible with this type of object. How can I do this without iterating to build an inverse QC column called "keep" or something equally ridiculous? Any help gratefully appreciated!


Solution

  • Add ! to get the complement of your logical vector. Below is an example that discards the first 10 columns (cells) of the sample dataset.

    Cells (samples) are stored as columns, so you can subset the object like a matrix:

    library(Seurat)
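    # we use an example dataset
    dim(pbmc_small)
    [1] 230  80
    
    # flag the first 10 cells for removal
    pbmc_small@meta.data$discard = rep(c(TRUE,FALSE),c(10,70))
    # negate with ! to keep only the cells that are not flagged
    newdata = pbmc_small[,!pbmc_small@meta.data$discard]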
    
    dim(newdata)
    [1] 230  70
    
    table(colnames(pbmc_small)[1:10] %in% colnames(newdata))
    
    FALSE 
       10
    

    Alternatively, you can subset by cell name:

    id_keep = colnames(pbmc_small)[!pbmc_small@meta.data$discard]
    newdata = subset(pbmc_small,cells=id_keep)
    dim(newdata)
    [1] 230  70
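
    The same negation trick can be tried without Seurat installed. The following is a minimal base R sketch, not Seurat code: the toy matrix m, its gene/cell names, and the discard vector are all made up for illustration, standing in for the expression matrix and the metadata column.

```r
# Toy stand-in for an expression matrix: 5 features (rows) x 6 cells (columns).
# All names and values here are hypothetical, for illustration only.
m <- matrix(
  1:30,
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:6))
)

# Hypothetical QC flag, one entry per cell: TRUE means "failed QC, discard".
discard <- c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE)

# ! flips every element of the logical vector, so this keeps the cells
# that are NOT flagged. drop = FALSE keeps the result a matrix even if
# only one column survives.
kept <- m[, !discard, drop = FALSE]

dim(kept)
#> [1] 5 4

# The same selection by name, mirroring the cells = id_keep approach.
id_keep <- colnames(m)[!discard]
kept_by_name <- m[, id_keep, drop = FALSE]
identical(kept, kept_by_name)
#> [1] TRUE
```

    Both routes select the same columns; subsetting by name is just more explicit about which cells are retained.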