Workflow always results in "Nothing to do" even when forcing rules


As the title says, I can't get my workflow to execute anything except the all rule. When executing the all rule, Snakemake correctly finds all the input files, so the config file is okay and every path is correct.

When I run it without any additional flags, I get:

Building DAG of jobs...
Checking status of 0 jobs.
Nothing to be done

Things I tried:

  1. -f rcorrector -> only the all rule runs
  2. filenameR1.fcor_val1.fq -> MissingRuleException (no typos)
  3. --forceall -> only the all rule runs
  4. Some more fiddling that I can't describe clearly

Please help.

from os import path

configfile:"config.yaml"

RNA_DIR = config["RAW_RNA_DIR"]
RESULT_DIR = config["OUTPUT_DIR"]

FILES = glob_wildcards(path.join(RNA_DIR, '{sample}R1.fastq.gz')).sample
############################################################################
rule all:
    input:
        r1=expand(path.join(RNA_DIR, '{sample}R1.fastq.gz'), sample=FILES),
        r2=expand(path.join(RNA_DIR, '{sample}R2.fastq.gz'), sample=FILES)


#############################################################################

rule rcorrector:
    input:
        r1=path.join(RNA_DIR, '{sample}R1.fastq.gz'),
        r2=path.join(RNA_DIR, '{sample}R2.fastq.gz')
    output:
        o1=path.join(RESULT_DIR, 'trimmed_reads/corrected/{sample}R1.cor.fq'),
        o2=path.join(RESULT_DIR, 'trimmed_reads/corrected/{sample}R2.cor.fq')
    #group: "cleaning"
    threads: 8
    params: "-t {threads}"
    envmodules:
        "bio/Rcorrector/1.0.4-foss-2019a"
    script:
        "scripts/Rcorrector.py"

############################################################################

rule FilterUncorrectabledPEfastq:
    input:
        r1=path.join(RESULT_DIR, 'trimmed_reads/corrected/{sample}R1.cor.fq'),
        r2=path.join(RESULT_DIR, 'trimmed_reads/corrected/{sample}R2.cor.fq')
    output:
        o1=path.join(RESULT_DIR, "trimmed_reads/filtered/{sample}R1.fcor.fq"),
        o2=path.join(RESULT_DIR, "trimmed_reads/filtered/{sample}R2.fcor.fq")
    #group: "cleaning"
    envmodules:
        "bio/Jellyfish/2.2.6-foss-2017a",
        "lang/Python/2.7.13-foss-2017a"
        #TODO: load as module
    script:
        "/scripts/filterUncorrectable.py"

#############################################################################

rule trim_galore:
    input:
        r1=path.join(RESULT_DIR, "trimmed_reads/filtered/{sample}R1.fcor.fq"),
        r2=path.join(RESULT_DIR, "trimmed_reads/filtered/{sample}R2.fcor.fq")
    output:
        o1=path.join(RESULT_DIR, "trimmed_reads/{sample}.fcor_val1.fq"),
        o2=path.join(RESULT_DIR, "trimmed_reads/{sample}.fcor_val2.fq")
    threads: 8
    #group: "cleaning"
    envmodules:
        "bio/Trim_Galore/0.6.5-foss-2019a-Python-3.7.4"
    params:
        "--paired --retain_unpaired --phred33 --length 36 -q 5 --stringency 1 -e 0.1 -j {threads}"
    script:
        "scripts/trim_galore.py"

Solution

  • In Snakemake, you declare the pipeline's final output files as targets by listing them as the inputs of the first rule in the Snakefile. This rule is traditionally named all (more recent Snakemake documentation calls it targets).

    In your code, rule all lists the pipeline's input files, which already exist, so Snakemake sees nothing to do. Instead, it needs to list the pipeline's output files of interest:

    rule all:
        input:
            expand(path.join(RESULT_DIR, "trimmed_reads/{sample}.fcor_val{read}.fq"), sample=FILES, read=[1,2]),
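
To make it concrete which target files the expand(...) call above requests, here is a minimal plain-Python sketch that emulates it with itertools.product. The function expand_sketch is a hypothetical stand-in, not Snakemake's implementation, and the values of RESULT_DIR and FILES are assumptions chosen for illustration.

```python
from itertools import product
from os import path

RESULT_DIR = "results"            # assumed value of config["OUTPUT_DIR"]
FILES = ["sampleA_", "sampleB_"]  # assumed result of glob_wildcards(...).sample


def expand_sketch(pattern, **wildcards):
    """Fill `pattern` with every combination of the given wildcard values."""
    names = list(wildcards)
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*(wildcards[name] for name in names))
    ]


targets = expand_sketch(
    path.join(RESULT_DIR, "trimmed_reads/{sample}.fcor_val{read}.fq"),
    sample=FILES,
    read=[1, 2],
)

# One target per (sample, read) pair: these are the files rule all asks for.
for target in targets:
    print(target)
```

Each printed path matches an output pattern of rule trim_galore, which is what lets Snakemake chain back through the filtering and correction rules to the raw reads.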
    

    Why didn't your attempts work?

    1. -f not working:

    As per the documentation:

    --force, -f     
    
    Force the execution of the selected target or the first rule regardless of already created output.
    
    Default: False
    

    In your code, that is rule all, which has no output defined, so forcing it caused nothing else to run.

    2. filenameR1.fcor_val1.fq

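    This doesn't match the output of any rule, hence the MissingRuleException. A target given on the command line has to match a rule's full output pattern, including the directory prefix built from RESULT_DIR.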
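
The mismatch can be illustrated with a small plain-Python sketch that turns an output pattern into a regular expression, with each wildcard matching `.+` (Snakemake's documented default). The helpers pattern_to_regex and match_target are hypothetical, "results" is an assumed value for RESULT_DIR, and "filename" is the placeholder from the question.

```python
import re

RESULT_DIR = "results"  # assumed value of config["OUTPUT_DIR"]
OUTPUT = RESULT_DIR + "/trimmed_reads/{sample}.fcor_val1.fq"  # o1 of trim_galore


def pattern_to_regex(pattern):
    """Turn 'dir/{sample}.fq' into a regex where each {name} matches '.+'."""
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        # Even indices are literal text, odd indices are wildcard names.
        regex += re.escape(part) if i % 2 == 0 else f"(?P<{part}>.+)"
    return re.compile(regex)


def match_target(target, pattern=OUTPUT):
    """Return the wildcard values if `target` matches `pattern`, else None."""
    m = pattern_to_regex(pattern).fullmatch(target)
    return m.groupdict() if m else None


# The bare file name lacks the directory prefix, so no rule can produce it.
print(match_target("filenameR1.fcor_val1.fq"))

# The full path matches, with the sample wildcard bound to the prefix before R1/R2.
print(match_target("results/trimmed_reads/filename.fcor_val1.fq"))
```

Note that the matching target also drops the R1 part: the trim_galore outputs are named {sample}.fcor_val1.fq, and the sample wildcard is whatever precedes R1 in the raw file names.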

    3. --forceall

    The same reasoning as for the -f flag applies: rule all depends only on files that no rule produces, so there are no upstream rules to force.

    --forceall, -F  
    
    Force the execution of the selected (or the first) rule and all rules it is dependent on regardless of already created output.
    
    Default: False
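
As a rough illustration of why neither flag helps while rule all points at the raw reads, here is a toy dependency planner in plain Python. It is only a simplified model under stated assumptions: real Snakemake also compares timestamps and resolves wildcards, and the file names and the plan function are hypothetical.

```python
# Toy model only. Each produced file maps to (rule name, input files).
PRODUCERS = {
    "corrected.fq": ("rcorrector", ["raw.fastq.gz"]),
    "filtered.fq": ("FilterUncorrectabledPEfastq", ["corrected.fq"]),
    "trimmed.fq": ("trim_galore", ["filtered.fq"]),
}


def plan(targets, existing, force_all=False):
    """List the rules that must run, in order, to create `targets`."""
    jobs = []

    def visit(filename):
        if filename not in PRODUCERS:
            return  # source file: no rule creates it, so nothing can be run
        if filename in existing and not force_all:
            return  # already present and not forced
        rule, inputs = PRODUCERS[filename]
        for dep in inputs:
            visit(dep)  # schedule upstream rules first
        if rule not in jobs:
            jobs.append(rule)

    for target in targets:
        visit(target)
    return jobs


on_disk = {"raw.fastq.gz"}

# Original rule all: the targets are the raw reads, which no rule produces.
print(plan(["raw.fastq.gz"], on_disk))                  # []
print(plan(["raw.fastq.gz"], on_disk, force_all=True))  # []

# Corrected rule all: the target is the final trimmed output.
print(plan(["trimmed.fq"], on_disk))
# ['rcorrector', 'FilterUncorrectabledPEfastq', 'trim_galore']
```

In this model, forcing changes nothing when the requested files are source files, while requesting the final output pulls in all three rules.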