I have a Snakemake rule that reads from a directory called merged, which contains two separate scRNA-seq datasets. I want to use Snakemake together with Cell Ranger to run any number of samples. I am getting the error below from my cellranger_count rule; I don't understand it, and Google isn't helping. Could someone please make this a teachable moment?
merged
SC111111-TTATTCGAGG-AGCAGGACAG_merged_R1.fastq.gz SC222222-TGCGCGGTTT-TTTATCCTTG_merged_R1.fastq.gz
SC111111-TTATTCGAGG-AGCAGGACAG_merged_R2.fastq.gz SC222222-TGCGCGGTTT-TTTATCCTTG_merged_R2.fastq.gz
rule cellranger_count:
    input: rules.merge_fastqs.output
    output:
        maxtrix_h5 = '{sampleID}_TenXAnalysis/outs/raw_feature_bc_matrix.h5',
        metrics = '{sampleID}_TenXAnalysis/outs/metrics_summary.csv',
        barcodes = '{sampleID}_TenXAnalysis/outs/raw_feature_bc_matrix/barcodes.tsv.gz',
        features = '{sampleID}_TenXAnalysis/outs/raw_feature_bc_matrix/features.tsv.gz',
        matrix = '{sampleID}_TenXAnalysis/outs/raw_feature_bc_matrix/matrix.mtx.gz'
    threads: 16
    params:
        ref = 'PATH/yard/apps/refdata-gex-GRCh38-2020-A',
        #sample_id = '{sampleID}_merged'
    ## id = unique run ID string
    ## fastqs = Path to data
    ## sample = Sample names as specified in the sample sheet
    ## transcriptome = Path to Cell Ranger compatible transcriptome reference
    ## localcores = tells cellranger how many cores to use
    ## localmem = how much mem to use
    shell: """
        module load cellranger/6.1.2
        rm -rf {wildcards.sampleID}_TenXAnalysis
        cellranger count --id={wildcards.sampleID}_TenXAnalysis \
            --fastqs={input} \
            --sample={wildcards.sampleID} \
            --transcriptome={params.ref} \
            --localcores={threads} \
            --localmem=128
        """
logfile error:
error: Found argument 'merged/SC111111-TTATTCGAGG-AGCAGGACAG_merged_R2.fastq.gz' which wasn't expected, or isn't valid in this context
If you tried to supply `merged/SC111111-TTATTCGAGG-AGCAGGACAG_merged_R2.fastq.gz` as a PATTERN use `-- merged/SC111111-TTATTCGAGG-AGCAGGACAG_merged_R2.fastq.gz`
Do you expect rules.merge_fastqs.output to be a directory or a list of fastq files?
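If I understand your post correctly, rules.merge_fastqs.output is a list of fastq files, and Snakemake passes it to cellranger as a space-separated list, i.e.:
--fastqs=merged/fastq1.fastq.gz merged/fastq2.fastq.gz merged/fastq3.fastq.gz etc
However, cellranger doesn't seem to support this way of passing multiple fastq files. Only the first path ends up attached to --fastqs; the shell hands the remaining paths to cellranger as separate arguments, which is why the error names the R2 file.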
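To make the failure concrete, here is a small Python sketch of what I believe happens to the command line. The two file names are taken from your post and stand in for rules.merge_fastqs.output; the space-join mimics how Snakemake renders a multi-file {input}, and shlex.split approximates how the shell splits the result into arguments. Treat it as an illustration, not as Snakemake's or cellranger's actual code.

```python
import shlex

# Stand-in for rules.merge_fastqs.output (file names copied from the question)
fastqs = [
    "merged/SC111111-TTATTCGAGG-AGCAGGACAG_merged_R1.fastq.gz",
    "merged/SC111111-TTATTCGAGG-AGCAGGACAG_merged_R2.fastq.gz",
]

# A multi-file {input} is rendered as the paths joined by single spaces
rendered = "cellranger count --fastqs=" + " ".join(fastqs)

# The shell splits the command on whitespace into separate arguments
tokens = shlex.split(rendered)

fastqs_arg = tokens[2]   # the only token still attached to --fastqs
stray_args = tokens[3:]  # everything else arrives as unexpected positional arguments

print(fastqs_arg)
print(stray_args)
```

The stray argument is the R2 file, which matches the path named in your error message.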
So I think the issue is not so much with Snakemake as with the way you invoke cellranger. Try running snakemake with the -p option to print the shell commands that are actually executed, and check whether they are what you expect.
If this doesn't help, post the rule merge_fastqs.
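If the diagnosis holds, one possible fix is to give --fastqs the directory that contains the files rather than the files themselves. As far as I know, cellranger count expects a directory path there, or several directories separated by commas. Below is a minimal sketch of deriving that value from a file list. The helper name fastq_dirs is hypothetical, and the file names are again copied from your post.

```python
import os


def fastq_dirs(paths):
    """Return the unique parent directories of the given paths, comma-separated."""
    dirs = []
    for path in paths:
        parent = os.path.dirname(path) or "."  # bare file names live in the current directory
        if parent not in dirs:                 # keep first-seen order, drop duplicates
            dirs.append(parent)
    return ",".join(dirs)


# Stand-in for rules.merge_fastqs.output (file names copied from the question)
fastqs = [
    "merged/SC111111-TTATTCGAGG-AGCAGGACAG_merged_R1.fastq.gz",
    "merged/SC111111-TTATTCGAGG-AGCAGGACAG_merged_R2.fastq.gz",
]

fastqs_option = "--fastqs=" + fastq_dirs(fastqs)
print(fastqs_option)  # --fastqs=merged
```

In the rule itself, this could be wired in through a params function, for example fastq_dir = lambda wildcards, input: os.path.dirname(input[0]), combined with --fastqs={params.fastq_dir} in the shell command and import os at the top of the Snakefile. That assumes all merged files for a sample sit in the same directory, which I can't confirm without seeing merge_fastqs.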