
Snakefile executes only one job/rule


The Snakefile consists of two rules: one downloads a genome, the other uses bowtie2 to build a bowtie2 index from the resulting .fna file. The code is below:

rule reference_genome_download:
    output:
      "reference_genome/ref_genome.fna"
    shell:
      """
      gca="GCA_000372685.2"
      datasets download genome accession $gca --exclude-gff3 --exclude-protein --exclude-rna
      unzip ncbi_dataset.zip
      cat $(ls ncbi_dataset/data/$gca/chr*) > {output}
      rm -r ncbi_dataset
      rm README.md
      rm ncbi_dataset.zip
      """
rule build_bowtie_index:
    input:
      "reference_genome/ref_genome.fna"
    output:
      "reference_genome/btbuild.log"
    shell:
      "bowtie2-build {input} reference_genome/ref_genome_btindex > {output}"

When I do a dry run with snakemake -n -c 10, I get the following:

Building DAG of jobs...
Job stats:
job                          count    min threads    max threads
-------------------------  -------  -------------  -------------
reference_genome_download        1              1              1
total                            1              1              1


[Fri Jan 28 12:54:25 2022]
rule reference_genome_download:
    output: reference_genome/ref_genome.fna
    jobid: 0
    resources: tmpdir=/tmp

The rule build_bowtie_index doesn't even appear as a job option. How do I get the two to link?


Solution

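  • With Snakemake, I find it more useful to think in terms of what I want at the end rather than in terms of a sequence of jobs. When no target is given on the command line, Snakemake takes the first rule in the Snakefile as the default target and pulls in other rules only when their outputs are needed to build it. Your first rule is reference_genome_download, which has no inputs, so it is the only job scheduled and nothing ever asks for the output of build_bowtie_index.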

    In your case, the first rule should be something like:

    rule all:
        input:
          "reference_genome/btbuild.log",
    

    meaning that at the end you want reference_genome/btbuild.log. Snakemake will figure out that to produce reference_genome/btbuild.log it needs to run reference_genome_download first and then build_bowtie_index. In fact, the order of the rules after the first doesn't even matter; Snakemake will put them in the right order by itself.
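
To make the "work backwards from the target" idea concrete, here is a small Python sketch. It is a toy model, not Snakemake's actual implementation: the RULES table and the plan function are hypothetical names made up for illustration, and the sketch ignores wildcards, timestamps, and cycle detection. It only shows why a target with no inputs schedules a single job, while a rule all target pulls in both rules.

```python
# Toy model of target-driven resolution; NOT Snakemake's real implementation.
# Each hypothetical entry maps a rule name to (inputs, outputs),
# mirroring the Snakefile from the question plus the suggested "rule all".
RULES = {
    "all": (["reference_genome/btbuild.log"], []),
    "reference_genome_download": ([], ["reference_genome/ref_genome.fna"]),
    "build_bowtie_index": (
        ["reference_genome/ref_genome.fna"],
        ["reference_genome/btbuild.log"],
    ),
}


def plan(target_rule, rules):
    """Return the rule names, in execution order, needed to run target_rule."""
    # Index every output file by the rule that produces it.
    producer = {out: name for name, (_, outs) in rules.items() for out in outs}
    order = []

    def visit(name):
        if name in order:
            return  # already scheduled
        for path in rules[name][0]:
            # Inputs that no rule produces are assumed to exist on disk already.
            if path in producer:
                visit(producer[path])
        order.append(name)  # a rule runs only after everything it depends on

    visit(target_rule)
    return order


# With "all" as the target, both real rules are scheduled before it.
print(plan("all", RULES))
# With the download rule as the target (the first rule in the original
# Snakefile), nothing else is needed, which matches the dry-run output.
print(plan("reference_genome_download", RULES))
```

The position of a rule in the table plays no role in the result; only the target and the input/output file names do, which is the same reason the order of rules after the first does not matter in a Snakefile.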