Correctly consuming a multiline config file in snakemake as an input


For various reasons I would like to be able to define my inputs in a separate config file. My current version without using a config file looks like:

rule test:
   input:
     labs = "data/labs.csv",
     demo = "data/demo.csv"
   output:
     "outputs/output.txt"
   script:
     "programs/myprogram.py"

Instead of this I would like my config file to be something like:

{
 "inputs": {
        "labs" : "data/labs.csv",
         "demo": "data/demo.csv"
  }
}

And then my snakemake file would be:

rule test:
   input:
     config["inputs"]
   output:
     "outputs/output.txt"
   script:
     "programs/myprogram.py"

However, I get an error saying that input files are missing for the rule, listing labs and demo as the affected files.

I imagine I could parse this into a list that the input section could understand, but ideally I would like my inputs to retain their names. It is not clear to me how to achieve this.


Solution

  • YAML might be a better choice for the config format since it is more readable. Suppose we have a config.yml containing:

    inputs:
       labs: data/labs.csv
       demo: data/demo.csv
    

    We can load this YAML file using configfile. Then use **config["inputs"] in the input section: the ** operator unpacks the dictionary and passes its entries as key=value pairs, so the inputs keep their names:

    configfile: "config.yml"
    
    rule test:
        input: **config["inputs"] 
        shell: 'echo {input}'
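
To see why the ** matters, here is a minimal plain-Python sketch of the unpacking behaviour. The function collect_inputs is a hypothetical stand-in for an input section that accepts positional and named entries; it is not Snakemake's actual implementation. The sketch assumes the original error arises because iterating over a dictionary yields only its keys, which would explain why labs and demo were reported as missing files.

```python
# Hypothetical stand-in for an input section that accepts positional and
# named entries; this is NOT Snakemake's real implementation.
def collect_inputs(*positional, **named):
    return list(positional), dict(named)


# Same structure as the config shown in the question.
config = {"inputs": {"labs": "data/labs.csv", "demo": "data/demo.csv"}}

# Iterating over a dict yields its keys, not its values, so the paths are lost.
keys_only = list(config["inputs"])

# With **, each key/value pair becomes a named argument, so names and paths survive.
positional, named = collect_inputs(**config["inputs"])

print(keys_only)
print(named)
```

With the names preserved, the script should be able to refer to each file by name rather than by position.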