Search code examples
snakemake

snakemake: missing last rule by changing rull all command lines


below is my snakemake codes, if i don't comment out line28,29 code,which is rule all->input->the 1st,2nd command lines,then I can't get the last rule varscan_somatic, that is to say,the dry run output likes this:

Job counts:
        count   jobs
        1       all
        25      mpileup_analysis
        25      normal_mpileup
        25      tumor_mpileup
        76
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.

but if i do comment out line28,29 code,which is rule all->input->the 1st,2nd command lines,then I can get the last rule varscan_somatic, that is to say,the dry run output likes this:

Job counts:
        count   jobs
        1       all
        25      mpileup_analysis
        25      normal_mpileup
        25      tumor_mpileup
        25      varscan_somatic
        101
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.

I don't know why it does happen? Can anybody give me some advise?thanks a lot for any help.

import re
import os

mydict = dict()
with open("config.txt") as HD:
    for line in HD:
        line = line.rstrip()
        if line.startswith("#"):
            continue
        value,field = re.split("\s*=\s*",line)
        mydict[value] = field

VarScan    = mydict['VarScan']
SAMtools   = mydict['SAMtools']
REFERENCE  = mydict['REFERENCE']
PL_MPPANA  = mydict['MPPANA']

chrlist = [i for i in os.listdir("call_region") if i.endswith(".region.bed")]
def replace(str1):
    str2 = str1.replace(".region.bed","")
    return str2
chrlist = map(replace,chrlist)

configfile:"paired_test.yaml"

rule all:
    input:
#        expand("varscan_somatic/{sample}/{chrid}.normal.mpileup.analysis",sample=config['samples'],chrid=chrlist),
#        expand("varscan_somatic/{sample}/{chrid}.tumor.mpileup.analysis",sample=config['samples'],chrid=chrlist),
        expand("varscan_somatic/{sample}/{chrid}.snp.vcf",sample=config['samples'],chrid=chrlist),
        expand("varscan_somatic/{sample}/{chrid}.indel.vcf",sample=config['samples'],chrid=chrlist)

rule normal_mpileup:
    input:
        bam=lambda wc:config['samples'][wc.sample][3],
        bed="call_region/{chrid}.region.bed"
    output:
        "varscan_somatic/{sample}/{chrid}.normal.mpileup"
    log:
        "log/varscan_somatic/{sample}/{chrid}.normal.mpileup.log"
    shell:
        "{SAMtools} mpileup -f {REFERENCE} -l {input.bed} "
        "{input.bam} -d1000 -Q10 -q10 -o {output} "
        "1>{log} 2>&1"

rule tumor_mpileup:
    input:
        bam=lambda wc:config['samples'][wc.sample][1],
        bed="call_region/{chrid}.region.bed"
    output:
        "varscan_somatic/{sample}/{chrid}.tumor.mpileup"
    log:
        "log/varscan_somatic/{sample}/{chrid}.tumor.mpileup.log"
    shell:
        "{SAMtools} mpileup -f {REFERENCE} -l {input.bed} "
        "{input.bam} -d1000 -Q10 -q10 -o {output} "
        "1>{log} 2>&1"

rule mpileup_analysis:
    input:
        tumor="varscan_somatic/{sample}/{chrid}.tumor.mpileup",
        norml="varscan_somatic/{sample}/{chrid}.normal.mpileup"
    output:
        tumor="varscan_somatic/{sample}/{chrid}.tumor.mpileup.analysis",
        norml="varscan_somatic/{sample}/{chrid}.normal.mpileup.analysis"
    log:
        "log/varscan_somatic/{sample}/{chrid}.mpileup_analysis.log"
    shell:
        "{PL_MPPANA} {input.tumor} {output.tumor} 1>{log} 2>&1 "
        "&& {PL_MPPANA} {input.norml} {output.norml} 1>>{log} 2>&1"

rule varscan_somatic:
    input:
        tumor="varscan_somatic/{sample}/{chrid}.tumor.mpileup",
        norml="varscan_somatic/{sample}/{chrid}.normal.mpileup",
        temp1="varscan_somatic/{sample}/{chrid}.tumor.mpileup.analysis",
        temp2="varscan_somatic/{sample}/{chrid}.normal.mpileup.analysis"
    output:
        "varscan_somatic/{sample}/{chrid}.snp",
        "varscan_somatic/{sample}/{chrid}.indel"
    log:
        "log/varscan_somatic/{sample}/{chrid}.varscan_somatic.log"
    params:
        "varscan_somatic/{sample}/{chrid}",
        "--validation 1 --output-vcf 1"
    shell:
        "{VarScan} somatic {input.tumor} {input.norml} {params} 1>{log} 2>&1"
>>>config.txt
VarScan = /path/my/varscan
REFERENCE = /path/my/hg19.fa
SAMtools = /path/my/samtools
MPPANA = /path/my/mppana.pl
>>> paired.yaml
samples:
    S01:['S01','S01.bqsr.bam','S02','S02.bqsr.bam']

below is: snakemake summary without comments

snakemake -s bin/varscan_somatic_paired.py -np --forceall --summary
Building DAG of jobs...
output_file     date    rule    version log-file(s)     status  plan
varscan_somatic/S001/chr21.tumor.mpileup.analysis       Thu Sep 26 02:34:56 2019        mpileup_analysis        -       log/varscan_somatic/S001/chr21.mpileup_analysis.log ok      update pending
varscan_somatic/S001/chr21.normal.mpileup.analysis      Thu Sep 26 02:34:56 2019        mpileup_analysis        -       log/varscan_somatic/S001/chr21.mpileup_analysis.log ok      update pending
varscan_somatic/S001/chr10.tumor.mpileup.analysis       Wed Sep 25 22:56:59 2019        mpileup_analysis        -       log/varscan_somatic/S001/chr10.mpileup_analysis.log ok      update pending

below is: snakemake summary with comments

Building DAG of jobs...
output_file     date    rule    version log-file(s)     status  plan
varscan_somatic/S001/chr19.snp.vcf      Wed Sep 25 22:14:13 2019        varscan_somatic -       log/varscan_somatic/S001/chr19.varscan_somatic.log ok       update pending
varscan_somatic/S001/chr19.indel.vcf    Wed Sep 25 22:14:13 2019        varscan_somatic -       log/varscan_somatic/S001/chr19.varscan_somatic.log ok       update pending
varscan_somatic/S001/chr14.snp.vcf      Thu Sep 26 01:22:17 2019        varscan_somatic -       log/varscan_somatic/S001/chr14.varscan_somatic.log ok       update pending

enter image description here

enter image description here


Solution

  • This code is wrong: before fix:

    def replace(str1):
        str2 = str1.replace(".region.bed","")
        return str2
    chrlist = map(replace,chrlist)
    

    after fix:

    def replace(str1):
        str2 = str1.replace(".region.bed","")
        return str2
    chrlist = list(map(replace,chrlist))
    

    then everything is ok.