What is the proper way to access values in nested dictionaries with Snakemake?


I often want to process several entities in my analysis. I describe them in my config.yml file as follows:

entities:
    fbr:
        full_name: "FUBAR"
        filepath: "data/junk_1.txt"
    snf:
        full_name: "SNAFU"
        filepath: "data/junk_2.txt"

What is the idiomatic way to pass these file paths as input to a rule? I know that I can do it with a list comprehension. Here is a toy Snakefile:

configfile: "config.yml"

rule all:
    input:
        "data/result.txt"

rule first:
    input:
        expand("{entity}", entity=[x["filepath"] for x in config["entities"].values()])
    output:
        "data/result.txt"
    shell:
        "cat {input} > {output}"

This approach works but feels cumbersome and ugly. Is there a better way to perform expansion with nested dictionaries in Snakemake?


Solution

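  • The expand(...) call is superfluous here: its template, "{entity}", returns each value unchanged, so the list comprehension can be passed to input directly: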

    configfile: "config.yml"
    
    rule all:
        input:
            "data/result.txt"
    
    rule first:
        input:
            [x["filepath"] for x in config["entities"].values()]
        output:
            "data/result.txt"
        shell:
            "cat {input} > {output}"
    

    Given the nested dictionary structure of your config.yml, the list comprehension is hard to improve on: each file path sits one level below its entity key, so some loop over config["entities"].values() is needed to collect them.
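
If the comprehension is needed in several rules, one option is to wrap it in a small function. The following is a minimal plain-Python sketch, not something built into Snakemake: the `config` dictionary is a hand-written stand-in for what `configfile: "config.yml"` would load, and `entity_values` is a hypothetical helper name chosen for illustration.

```python
# Stand-in for the dictionary Snakemake builds from config.yml (assumption:
# the YAML in the question loads into exactly this nested structure).
config = {
    "entities": {
        "fbr": {"full_name": "FUBAR", "filepath": "data/junk_1.txt"},
        "snf": {"full_name": "SNAFU", "filepath": "data/junk_2.txt"},
    }
}


def entity_values(config, key):
    """Hypothetical helper: collect one field from every entity, in config order."""
    return [entity[key] for entity in config["entities"].values()]


filepaths = entity_values(config, "filepath")

# Formatting the bare template "{entity}" gives back each path unchanged,
# which is why wrapping the list in expand("{entity}", ...) adds nothing.
assert ["{entity}".format(entity=path) for path in filepaths] == filepaths
```

With such a helper defined near the top of the Snakefile, the input of rule first could read `entity_values(config, "filepath")`. Whether that is clearer than the inline comprehension is a matter of taste; for a single rule the comprehension is probably enough.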