I have a Snakemake pipeline where I get my input/output paths for my file folders from a json file and use the expand function to obtain the paths.
import json
with open('config.json', 'r') as f:
config = json.load(f)
wildcard = ["1234", "5678"]
rule them_all:
input:
expand('config["data_input"]/data_{wc}.tab', wc = wildcard)
output:
expand('config["data_output"]/output_{wc}.rda', wc = wildcard)
shell:
"Rscript ./my_script.R"
My config.json
is
{
"data_input": "/very/long/path",
"data_output": "/slightly/different/long/path"
}
While trying to make a dry run, though, I get the following error:
$ snakemake -np
Building DAG of jobs...
MissingInputException in line 12 of /path/to/Snakefile:
Missing input files for rule them_all:
config["data_input"]/data_1234.tab
config["data_input"]/data_5678.tab
The files are there and their path is /very/long/path/data_1234.tab
.
This is probably a low-hanging fruit, but what am I doing wrong in the syntax for the expansion? Or is it the way I call the json file?
expand()
does not interpret access to dictionaries for its first argument while expanding the path with quotation marks, so this operation with expand()
has to be done in a wildcard.
The correct syntax, in this case, would be e.g.
expand('{input_folder}/data_{wc}.tab', wc = wildcard, input_folder = config["data_input"])