Error in NormalizeData.default when running DoubletFinder on an integrated Seurat object in R


I'm trying to run DoubletFinder on a Seurat object produced by integrating several datasets.

The Seurat object has 2 assays: RNA & integrated.

The integrated Seurat object has been fully processed:

  • NormalizeData and FindVariableFeatures on each dataset before integration

  • ScaleData, RunPCA, FindNeighbors, FindClusters, RunUMAP on the integrated object.

The paramSweep_v3() function of DoubletFinder gives the following output:

sweep.res.list <- paramSweep_v3(integrated.seu, PCs = 1:38, sct = FALSE)
Loading required package: fields
Loading required package: spam
Loading required package: dotCall64
Loading required package: grid
Spam version 2.5-1 (2019-12-12) is loaded.
Type 'help( Spam)' or 'demo( spam)' for a short introduction 
and overview of this package.
Help for individual functions is also obtained by adding the
suffix '.spam' to the function name, e.g. 'help( chol.spam)'.

Attaching package: ‘spam’

The following object is masked from ‘package:R.utils’:

    cleanup

The following objects are masked from ‘package:base’:

    backsolve, forwardsolve

Loading required package: maps
See https://github.com/NCAR/Fields for
 an extensive vignette, other supplements and source code 
[1] "Creating artificial doublets for pN = 5%"
[1] "Creating Seurat object..."
[1] "Normalizing Seurat object..."
Error in NormalizeData.default(object = GetAssayData(object = object,  : 
  trying to get slot "params" from an object of a basic class ("NULL") with no slots

Why does this indicate there are no slots in my Seurat object?
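
A note on the message itself: in R, this kind of error is raised when the `@` slot operator is applied to `NULL`. It therefore points at one missing value that the function tried to read, not at the Seurat object as a whole. The base-R sketch below reproduces the situation without Seurat. `cmds` is a hypothetical stand-in for a command log, and the entry name `NormalizeData.RNA` is an assumption for illustration, not something confirmed from the DoubletFinder source.

```r
# `cmds` is a hypothetical stand-in for a command log that has no
# normalization entry; the entry name used below is an assumption.
cmds <- list(ScaleData.integrated = "placeholder")

entry <- cmds$NormalizeData.RNA      # missing list element, so this is NULL

# Asking NULL for a slot raises an error; capture its message instead of stopping.
msg <- tryCatch(
  entry@params,
  error = function(e) conditionMessage(e)
)

print(is.null(entry))
print(msg)

# On a real object, names(integrated.seu@commands) lists the logged commands.
```

The exact wording of the captured message depends on the R version, so it may not match the output above character for character.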


Solution

  • The DoubletFinder README explicitly says not to run it on an aggregated dataset, because the artificial doublets it generates would then mix cells from different samples:

    https://github.com/chris-mcginnis-ucsf/DoubletFinder

    Do not apply DoubletFinder to aggregated scRNA-seq data representing multiple distinct samples (e.g., multiple 10X lanes). For example, if you run DoubletFinder on aggregated data representing WT and mutant cell lines sequenced across different 10X lanes, artificial doublets will be generated from WT and mutant cells, which cannot exist in your data. These artificial doublets will skew results. Notably, it is okay to run DoubletFinder on data generated by splitting a single sample across multiple 10X lanes.

    I handled this by reading in the individual samples, clustering each one separately, running DoubletFinder on each, removing the doublets, and only then running the integration workflow.
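
The per-sample steps described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the exact code used in the answer: `find_singlets`, `pcs`, and `expected_rate` are hypothetical names, `pN = 0.25` and the 5% doublet rate are placeholder values to be replaced with ones that suit your data, and the DoubletFinder calls use the `_v3` function names from the question (other releases of the package may name them differently).

```r
# Sketch only: assumes Seurat and a DoubletFinder release exporting the *_v3
# functions. `expected_rate` is a placeholder; derive it from your cell loading.
find_singlets <- function(seu, pcs = 1:20, expected_rate = 0.05) {
  # Standard preprocessing of ONE sample on its RNA assay.
  seu <- Seurat::NormalizeData(seu)
  seu <- Seurat::FindVariableFeatures(seu)
  seu <- Seurat::ScaleData(seu)
  seu <- Seurat::RunPCA(seu, npcs = max(pcs))
  seu <- Seurat::FindNeighbors(seu, dims = pcs)
  seu <- Seurat::FindClusters(seu)

  # pK selection: sweep, summarise, then take the pK with the highest BCmetric.
  sweep_res   <- DoubletFinder::paramSweep_v3(seu, PCs = pcs, sct = FALSE)
  sweep_stats <- DoubletFinder::summarizeSweep(sweep_res, GT = FALSE)
  bcmvn       <- DoubletFinder::find.pK(sweep_stats)
  best_pk     <- as.numeric(as.character(bcmvn$pK[which.max(bcmvn$BCmetric)]))

  # Expected number of doublets, reduced by the estimated homotypic proportion.
  homotypic <- DoubletFinder::modelHomotypic(seu$seurat_clusters)
  n_exp     <- round(expected_rate * ncol(seu) * (1 - homotypic))

  seu <- DoubletFinder::doubletFinder_v3(
    seu, PCs = pcs, pN = 0.25, pK = best_pk,
    nExp = n_exp, reuse.pANN = FALSE, sct = FALSE
  )

  # The classification column name encodes pN, pK and nExp, so find it by prefix.
  df_col <- grep("^DF.classifications", colnames(seu@meta.data), value = TRUE)
  keep   <- colnames(seu)[seu@meta.data[[tail(df_col, 1)]] == "Singlet"]
  subset(seu, cells = keep)
}
```

With a list of per-sample Seurat objects, something like `lapply(sample_list, find_singlets)` (where `sample_list` is a hypothetical name) would return the filtered objects, which can then go into the usual integration workflow.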