python-3.x snakemake

Reuse of a rule in Snakemake


Is there a way to reuse a rule in Snakemake while changing only the params?

For instance:

rule job1:
    ...
    params:
        reference = "path/to/ref1"
    ...

rule job2:
    input: rules.job1.output
    ...
    params:
        reference = "path/to/ref2"

The job1 and job2 rules do the same thing, but I need to run them one after the other with a different reference parameter each time. This generates lots of code for a very similar task.

I tried to make a sub-workflow for this step, which makes the main Snakefile more readable. However, the sub-workflow code is still repeated.

Any idea or suggestion? Did I miss something?

EDIT
To be more specific, job2 has to be executed after job1, using the output of the latter.


Solution

  • If the rule is the same, you could use a wildcard in the output file name. This way, the same rule will be executed once per reference:

    references = ["ref1", "ref2"]
    
    rule all:
        input: expand("my_output_{ref}", ref=references)
    
    rule job:
        input: "my_input"
        output: "my_output_{ref}"
        params: ref = "path/to/{ref}"
        shell: "... {params.ref} {output}"
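
To see why one rule is enough here, the following plain-Python sketch mimics which targets `rule all` requests and how the `{ref}` wildcard would fill the params string for each of them. It is only an illustration built on `str.format`, not Snakemake's actual implementation; the variable names `targets` and `params_by_target` are assumptions introduced for this example.

```python
references = ["ref1", "ref2"]

# Stand-in for expand("my_output_{ref}", ref=references):
# one requested output file per reference.
targets = ["my_output_{ref}".format(ref=ref) for ref in references]

# For each requested output, the value matched by {ref} is
# substituted into the params string "path/to/{ref}".
params_by_target = {
    target: "path/to/{ref}".format(ref=ref)
    for target, ref in zip(targets, references)
}

for target, reference_path in params_by_target.items():
    print(target, "->", reference_path)
```

Each requested file name triggers one job of the same rule, with its own value for the reference path.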
    

    Hope this helps. If not, could you make your question more specific?

    Edit

    Okay, it is possible to define custom inputs for a rule by using a Python function. Maybe you could look in this direction. See this working example:

    references = ["ref1", "ref2"]
    
    rule all:
        input: expand("my_output_{ref}", ref=references)
    
    
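    def rule_input(wildcards):
        # ref1 starts from the raw input; ref2 consumes the output of the ref1 job.
        if wildcards.ref == "ref1":
            return "my_first_input"
        elif wildcards.ref == "ref2":
            return "my_output_ref1"
        raise ValueError(f"Unexpected reference: {wildcards.ref}")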
    
    rule job:
        input: rule_input
        output: "my_output_{ref}"
        params: ref = "path/to/{ref}"
        shell: "echo input: {input} ; echo output: {output} ; touch {output}"
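
The if/elif chain above needs a new branch for every additional reference. Here is a minimal sketch of one way to generalize it, assuming each reference after the first should consume the output produced for the previous one. The third entry `ref3` is hypothetical, and `SimpleNamespace` is used only as a stand-in for the wildcards object that Snakemake would pass to the function, so the logic can be tried outside a workflow.

```python
from types import SimpleNamespace

# Order matters: each reference is processed after the one before it.
references = ["ref1", "ref2", "ref3"]  # "ref3" is a hypothetical extra step


def rule_input(wildcards):
    """Return the input file for the job handling wildcards.ref."""
    # list.index raises ValueError if the reference is unknown.
    position = references.index(wildcards.ref)
    if position == 0:
        # The first reference starts from the raw input file.
        return "my_first_input"
    # Every later reference consumes the previous reference's output.
    return f"my_output_{references[position - 1]}"


# Stand-in for the wildcards object of the job that builds my_output_ref3.
fake_wildcards = SimpleNamespace(ref="ref3")
print(rule_input(fake_wildcards))
```

With this version, adding another step to the chain only requires appending to `references`; the function itself can be passed to `input:` unchanged.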