Snakemake: deciding which rules to execute during execution


I'm working on a bioinformatics pipeline which must be able to run different rules to produce different outputs based on the contents of an input file:

def foo(file):
    '''
    Read the file and return a boolean value based on its contents.
    '''
    result = False  # placeholder; the code to read the file goes here...
    return result

rule check_input:
  input: "input.txt"
  run:
    # "result" instead of "bool", so the Python builtin is not shadowed
    result = foo("input.txt")

rule bool_is_True:
  input: "input.txt"
  output: "out1.txt"
  run:
    # Some code to generate out1.txt.
    # This rule is supposed to run only if foo("input.txt") is True
    pass

rule bool_is_False:
  input: "input.txt"
  output: "out2.txt"
  run:
    # Some code to generate out2.txt.
    # This rule is supposed to run only if foo("input.txt") is False
    pass

How do I write my rules to handle this situation? Also, how do I write my first rule (rule all) if the output files are unknown until the rule check_input has been executed?

Thanks!


Solution

  • You're right: Snakemake has to know which files to produce before it executes any rule. I therefore suggest using a function that reads what you called "the input file" and defines the targets of the workflow accordingly.

    For example:

    def getTargetsFromInput():
        targets = list()
        ## read file and add target files to targets
        return targets
    
    rule all:
        input: getTargetsFromInput()
    
    ...
    

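A minimal sketch of how getTargetsFromInput could be filled in, written as plain Python so it can be tried outside a Snakefile. The decision rule inside foo (True when the file contains at least one non-blank line) and the snake_case name get_targets_from_input are assumptions made purely for illustration; the output names out1.txt and out2.txt come from the question.

```python
from pathlib import Path


def foo(path):
    """Hypothetical stand-in for the question's foo().

    Assumption: the file counts as "True" when it has at least one
    non-blank line. Replace this with the real check.
    """
    text = Path(path).read_text()
    return any(line.strip() for line in text.splitlines())


def get_targets_from_input(path="input.txt"):
    """Return the files that rule all should request, based on the file contents."""
    # out1.txt is produced by rule bool_is_True, out2.txt by rule bool_is_False.
    return ["out1.txt"] if foo(path) else ["out2.txt"]
```

In the Snakefile, rule all would then declare input: get_targets_from_input("input.txt"). Since Snakemake only schedules the rules needed to build the requested targets, only one of bool_is_True and bool_is_False should end up running.
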
    You can pass the path of the input file with the --config argument on the snakemake command line, or use a structured input file (YAML or JSON) directly and load it with the configfile: keyword in the Snakefile: https://snakemake.readthedocs.io/en/stable/snakefiles/configuration.html
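
A rough sketch of how the path could be picked up from the configuration. Inside a Snakefile, config is the dictionary that Snakemake fills from --config and configfile:; here plain dicts stand in for it so the snippet runs on its own. The key name input_file and the helper resolve_input_path are hypothetical, chosen only for this example.

```python
def resolve_input_path(config, default="input.txt"):
    """Return the input file path stored in a Snakemake-style config dict.

    "input_file" is a hypothetical key name; fall back to `default`
    when the key was not supplied.
    """
    return config.get("input_file", default)


# Plain dicts stand in for the `config` object available in a Snakefile.
print(resolve_input_path({"input_file": "sample1.txt"}))
print(resolve_input_path({}))
```

Under these assumptions, running snakemake --config input_file=sample1.txt would point the workflow at sample1.txt, while omitting the option would fall back to input.txt.
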