I am using some Python scripts with Snakemake to automate a workflow. These scripts take command-line arguments and, while I could replace them with snakemake.input[0], snakemake.output[0], etc., I am reluctant to do so since I'd also like to be able to use them outside of Snakemake.
One natural way to solve this problem -- what I have been doing -- is to run them with the shell directive and not the script directive. However, when I do this the dependency tracking breaks: I update my script and the DAG doesn't think anything needs to be re-run.
Is there a way to pass command line arguments to scripts but still run them as a script?
Edit: an example
My python script
import argparse
# Read the output path from the command line
parser = argparse.ArgumentParser()
parser.add_argument("-o", type=str)
args = parser.parse_args()
# Write a fixed string to the requested output file
with open(args.o, "w") as file:
    file.write("My file output")
My Snakefile
rule some_rule:
    output: "some_file_name.txt"
    shell: "python my_script.py -o {output}"
Based on a comment from @troy-comi, I've been doing the following which -- while a bit of a hack -- does exactly what I want. I define the script as an input to the Snakemake rule, which can actually help readability as well. A typical rule (this is not a full MWE) might look like:
rule some_rule:
    input:
        files=expand("path_to_files/{f}", f=config["my_files"]),
        script="scripts/do_something.py"
    output: "path/to/my/output.txt"
    shell: "python {input.script} -i {input.files} -o {output}"
When I modify the script, it triggers a re-run; the rule is readable; and it doesn't require me to insert snakemake.output[0] in my Python scripts (which would make them hard to reuse outside this workflow).
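For completeness, here is a minimal sketch of what a script like scripts/do_something.py could look like on the Python side so that it accepts the -i and -o options used in the rule above. The function names parse_args and main and the "concatenate the inputs" step are hypothetical placeholders of mine, not the real script; the only point is that nargs="+" lets -i take the space-separated list that {input.files} expands to.

```python
import argparse


def parse_args(argv=None):
    """Parse the -i/-o options used in the rule's shell command."""
    parser = argparse.ArgumentParser()
    # nargs="+" accepts one or more paths, matching the space-separated
    # list that {input.files} expands to in the shell command.
    parser.add_argument("-i", nargs="+", required=True, help="input files")
    parser.add_argument("-o", required=True, help="output file")
    return parser.parse_args(argv)


def main(argv=None):
    """Placeholder processing: concatenate all inputs into the output."""
    args = parse_args(argv)
    with open(args.o, "w") as out:
        for path in args.i:
            with open(path) as src:
                out.write(src.read())
```

Saved as a script, it would end with the usual if __name__ == "__main__": main() guard, and could then be run by hand outside Snakemake as well, e.g. python scripts/do_something.py -i a.txt b.txt -o out.txt.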