Tags: python-3.x, bioinformatics, pipeline, snakemake

--latency-wait Error MissingOutputException


I am using Snakemake to build a pipeline for my work. My input files are FASTQ files such as these:

/project/ateeq/PROJECT/snakemake-example/raw_data/H667-1_R1.fastq.gz
/project/ateeq/PROJECT/snakemake-example/raw_data/H667-1_R2.fastq.gz
/project/ateeq/PROJECT/snakemake-example/raw_data/H667-2_R1.fastq.gz
/project/ateeq/PROJECT/snakemake-example/raw_data/H667-2_R2.fastq.gz

I have written the following Snakefile to trim the data with fastp:

"""
Author: Dave Amir
Affiliation: St Lukes
Aim: A simple Snakemake workflow to process paired-end stranded RNA-Seq.
Date: 11 June 2015
Run: snakemake   -s snakefile   
Latest modification: 
  - todo
"""

# This should be placed in the Snakefile.

##-----------------------------------------------##
## Working directory                             ##
## Adapt to your needs                           ##
##-----------------------------------------------##

BASE_DIR = "/project/ateeq/PROJECT"
WDIR = BASE_DIR + "/snakemake-example"
workdir: WDIR
#message("The current working directory is " + WDIR)

##--------------------------------------------------------------------------------------##
## Variables declaration                          
## Declaring some variables used by topHat and other tools... 
## (GTF file, INDEX, chromosome length)
##--------------------------------------------------------------------------------------##

INDEX = BASE_DIR + "/ref_files/hg19/assembly/"
GTF   = BASE_DIR + "/hg19/Hg19_CTAT_resource_lib/ref_annot.gtf"
CHR   = BASE_DIR + "/static/humanhg19.annot.csv"
FASTA = BASE_DIR + "/ref_files/hg19/assembly/hg19.fasta"

##--------------------------------------------------------------------------------------##
## The list of samples to be processed
##--------------------------------------------------------------------------------------##

SAMPLES, = glob_wildcards("/project/ateeq/PROJECT/snakemake-example/raw_data/{smp}_R1.fastq.gz")
NB_SAMPLES = len(SAMPLES)

for smp in SAMPLES: 
    message:("Sample " + smp + " will be processed")


##--------------------------------------------------------------------------------------##
## Our First rule  - sample trimming
##--------------------------------------------------------------------------------------##
rule final:
    input: expand("/project/ateeq/PROJECT/snakemake-example/trimmed/{smp}_R1_trimmed.fastq", smp=SAMPLES)

rule trimming:
    input:  fwd="/project/ateeq/PROJECT/snakemake-example/raw_data/{smp}_R1.fastq.gz",rev="/project/ateeq/PROJECT/snakemake-example/raw_data/{smp}_R2.fastq.gz"
    output: fwd="/project/ateeq/PROJECT/snakemake-example/trimmed/{smp}_R1_trimmed.fastq", rev="/project/ateeq/PROJECT/snakemake-example/trimmed/{smp}_R2_trimmed.fastq", rep="/project/ateeq/PROJECT/snakemake-example/report/{smp}.html"

threads: 30

message: """--- Trimming."""

shell: """
    fastp -i {input.fwd} -I {input.rev} -o {output.fwd} -O {output.rev} --detect_adapter_for_pe --disable_length_filtering --correction --qualified_quality_phred 30 --thread 16    --html {output.rep} --report_title "Fastq Quality Control Report" &>>{input.fwd}.log


"""
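
For readers unfamiliar with glob_wildcards and expand, the sample discovery and target list in the Snakefile above behave roughly like the plain-Python sketch below. It is an illustration only, not Snakemake's implementation: the helper names discover_samples and expand_targets are hypothetical, and the file list is hard-coded from the example inputs instead of being read from disk.

```python
import re

RAW_DIR = "/project/ateeq/PROJECT/snakemake-example/raw_data"
TRIM_DIR = "/project/ateeq/PROJECT/snakemake-example/trimmed"

# Example inputs from the question, hard-coded so the sketch runs anywhere.
FILES = [
    RAW_DIR + "/H667-1_R1.fastq.gz",
    RAW_DIR + "/H667-1_R2.fastq.gz",
    RAW_DIR + "/H667-2_R1.fastq.gz",
    RAW_DIR + "/H667-2_R2.fastq.gz",
]


def discover_samples(paths):
    """Hypothetical stand-in for glob_wildcards(".../{smp}_R1.fastq.gz")."""
    pattern = re.compile(re.escape(RAW_DIR) + r"/(?P<smp>.+)_R1\.fastq\.gz")
    samples = []
    for path in paths:
        match = pattern.fullmatch(path)
        if match:  # only the R1 files match, so each sample appears once
            samples.append(match.group("smp"))
    return samples


def expand_targets(samples):
    """Hypothetical stand-in for expand(".../{smp}_R1_trimmed.fastq", smp=SAMPLES)."""
    return [f"{TRIM_DIR}/{smp}_R1_trimmed.fastq" for smp in samples]


SAMPLES = discover_samples(FILES)
TARGETS = expand_targets(SAMPLES)
print(SAMPLES)
print(TARGETS)
```

Rule final asks for one trimmed R1 file per sample, so Snakemake has to find a rule (here, trimming) whose output pattern can produce each of those paths.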

When I run the pipeline, it shows the following error:

 MissingOutputException in line 51 of /project/ateeq/PROJECT/snakemake-example/snakefile:
Missing files after 5 seconds:
/project/ateeq/PROJECT/snakemake-example/trimmed/H667-1_R1_trimmed.fastq
/project/ateeq/PROJECT/snakemake-example/trimmed/H667-1_R2_trimmed.fastq
/project/ateeq/PROJECT/snakemake-example/report/H667-1.html
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.

Even though I increased the latency wait to 2000 seconds, the run still fails with an error:

snakemake -s snakefile -j 30 --latency-wait 2000
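
As the error message suggests, --latency-wait only extends how long Snakemake waits for the declared outputs to appear after a job finishes. The idea can be sketched as a polling loop. This is a simplified illustration under that assumption, not Snakemake's actual code, and wait_for_files is a hypothetical name.

```python
import os
import tempfile
import time


def wait_for_files(paths, latency_wait, poll_interval=0.1):
    """Return the paths that are still missing after waiting up to latency_wait seconds."""
    deadline = time.monotonic() + latency_wait
    missing = [p for p in paths if not os.path.exists(p)]
    while missing and time.monotonic() < deadline:
        time.sleep(poll_interval)
        # Re-check only the files that have not shown up yet.
        missing = [p for p in missing if not os.path.exists(p)]
    return missing


# Small demonstration: one file exists, the other is never written.
with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "present.fastq")
    absent = os.path.join(tmp, "absent.fastq")
    open(present, "w").close()
    still_missing = wait_for_files([present, absent], latency_wait=0.3)
    only_absent_missing = still_missing == [absent]
    print(only_absent_missing)
```

A longer wait helps only when the files are written but are slow to become visible, for example on a network filesystem. If the job never writes them, no timeout is long enough.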

I am using Snakemake 5.4.5 and Python 3.6.8. Please let me know where I am going wrong; it would be a great help.

Thanks for your kind help,

Sincerely,

Dave


Solution

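  • I don't know whether this is a copy-and-paste artifact, but the indentation is wrong. As posted, threads, message, and shell sit at column 0, outside rule trimming, so the rule presumably has no command to run and never creates its outputs, no matter how long --latency-wait is: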

    rule trimming:
        input:  fwd="/project/ateeq/PROJECT/snakemake-example/raw_data/{smp}_R1.fastq.gz",rev="/project/ateeq/PROJECT/snakemake-example/raw_data/{smp}_R2.fastq.gz"
        output: fwd="/project/ateeq/PROJECT/snakemake-example/trimmed/{smp}_R1_trimmed.fastq", rev="/project/ateeq/PROJECT/snakemake-example/trimmed/{smp}_R2_trimmed.fastq", rep="/project/ateeq/PROJECT/snakemake-example/report/{smp}.html"
    
    threads: 30
    
    message: """--- Trimming."""
    
    shell: """
        fastp -i {input.fwd} -I {input.rev} -o {output.fwd} -O {output.rev} --detect_adapter_for_pe --disable_length_filtering --correction --qualified_quality_phred 30 --thread 16    --html {output.rep} --report_title "Fastq Quality Control Report" &>>{input.fwd}.log
    """
    

    Shouldn't it be:

    rule trimming:
        input:  
            fwd="/project/ateeq/PROJECT/snakemake-example/raw_data/{smp}_R1.fastq.gz",
            rev="/project/ateeq/PROJECT/snakemake-example/raw_data/{smp}_R2.fastq.gz"
        output: 
            fwd="/project/ateeq/PROJECT/snakemake-example/trimmed/{smp}_R1_trimmed.fastq", 
            rev="/project/ateeq/PROJECT/snakemake-example/trimmed/{smp}_R2_trimmed.fastq", 
            rep="/project/ateeq/PROJECT/snakemake-example/report/{smp}.html"
        threads: 30
        message: """--- Trimming."""
        shell: """
            fastp -i {input.fwd} -I {input.rev} -o {output.fwd} -O {output.rev} --detect_adapter_for_pe --disable_length_filtering --correction --qualified_quality_phred 30 --thread 16    --html {output.rep} --report_title "Fastq Quality Control Report" &>>{input.fwd}.log
        """
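
One plausible reason the mis-indented version fails without a syntax error: lines that are not part of a rule are, as far as I understand, handed to Python as ordinary code, and a line such as threads: 30 is a valid Python 3.6+ variable annotation. The sketch below uses only the standard ast module to show that Python accepts these lines. It does not run Snakemake, the claim about how Snakemake treats such lines is an assumption, and the shell command is shortened with placeholder file names.

```python
import ast

# The three directives from the question, written at column 0 as in the
# mis-indented Snakefile. The fastp command is shortened and uses
# placeholder file names.
unindented = '''threads: 30

message: """--- Trimming."""

shell: """
    fastp -i in_R1.fastq.gz -I in_R2.fastq.gz
"""
'''

# Python parses each line as an annotated name with no assigned value.
tree = ast.parse(unindented)
statement_kinds = [type(node).__name__ for node in tree.body]
annotated_names = [node.target.id for node in tree.body]
print(statement_kinds)
print(annotated_names)
```

Because nothing here is an error from Python's point of view, the only symptom is the MissingOutputException once the rule finishes without having run fastp.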