I would like to run an R script multiple times on different input files with the help of Snakemake. To do this I tried using the expand function.
I am relatively new to Snakemake and, if I understand it correctly, the expand function gives me, for example, multiple input files, which are then all concatenated and available via {input}.
Is it possible to call the shell command on the files one by one?
Let's say I have this definition in my config.yaml:
types:
- "A"
- "B"
This would be my example rule:
rule manual_groups:
    input:
        expand("chip_{type}.bed", type=config["types"])
    output:
        expand("data/p_chip_{type}.model", type=config["types"])
    shell:
        "Rscript scripts/pre_process.R {input}"
This would lead to the command:
Rscript scripts/pre_process.R chip_A.bed chip_B.bed
Is it possible to instead call the command two times independently with two types like this:
Rscript scripts/pre_process.R chip_A.bed
Rscript scripts/pre_process.R chip_B.bed
Thank you for any help in advance!
Define the final target files in rule all, and then just use the appropriate wildcard (i.e., type) in rule manual_groups. Snakemake then runs rule manual_groups separately for each output file listed in rule all.
rule all:
    input:
        expand("data/p_chip_{type}.model", type=config["types"])

rule manual_groups:
    input:
        "chip_{type}.bed"
    output:
        "data/p_chip_{type}.model"
    shell:
        "Rscript scripts/pre_process.R {input}"
PS: You may want to rename the wildcard type, because of a potential conflict with Python's built-in type function.
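For intuition about why the wildcard version runs once per file, here is a rough plain-Python sketch of the idea. It does not use Snakemake itself: the `config` dict stands in for config.yaml, and `input_for` is a hypothetical helper that only mimics how a wildcard is inferred from a requested output. Snakemake's real matching is more general than this.

```python
# Stand-in for the values loaded from config.yaml (assumption for this sketch).
config = {"types": ["A", "B"]}

# With a single wildcard, expand() yields one path per value,
# much like this list comprehension.
targets = ["data/p_chip_{type}.model".format(type=t) for t in config["types"]]


def input_for(target):
    """Hypothetical helper: recover the wildcard value from an output
    path of the form data/p_chip_{type}.model and build the input path."""
    prefix, suffix = "data/p_chip_", ".model"
    wildcard = target[len(prefix):-len(suffix)]
    return "chip_{type}.bed".format(type=wildcard)


# One job, and therefore one shell command, per requested target.
commands = ["Rscript scripts/pre_process.R " + input_for(t) for t in targets]

for command in commands:
    print(command)
```

Each target requested in rule all is matched against the output pattern of manual_groups on its own, so {input} holds a single file per job instead of the whole expanded list.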