I have very recently started using Snakemake.
What I am trying to achieve is this: I have some samples which need to be modified independently (rule index_vcf). Then, groups of samples are the input of another rule (analyse), and we get an output for each group in groups.
I would like the second rule to run two commands:
some_script --input_files 'A2 Ah AL' --output ../out/A.out
and
some_script --input_files 'Banana BLM' --output ../out/B.out
I know how to do it if it's just for one group, but if I do it for both, then the wildcard sample_from_group, which I am expanding in analyse, needs to depend on the group, and I get the error:
unhashable type: 'list'
This is my config file:
groups:
  - A
  - B
samples:
  - A2
  - Ah
  - AL
  - Banana
  - BLM
grouped_samples:
  A: A2_mod, Ah_mod, AL_mod
  B: Banana_mod, BLM_mod
and this is my Snakefile:
configfile: "config_PCAWG.yaml"

samples = config["samples"]
groups = config["groups"]
grouped_samples = config["grouped_samples"]

rule all:
    input:
        expand("../out/{group}.out", group = groups)

rule index_vcf:
    input:
        "../data/{sample}"
    output:
        "../data/{sample}_mod"
    shell:
        "tabix -f {input}"

rule analyse:
    input:
        expand("{sample_from_group}", sample_from_group=grouped_samples[{group}].split())
    output:
        "../out/{group}.out
    shell:
        "some_script --input_files '{input}' --output {output}"
First, you need to correct the typos (e.g. there is a string that has no closing quote). Next, there is a logical error in that no rule produces anything that matches the pattern "../out/{group}.out". Did you mean "../data/{group}.out"?
Now the main part. This line does not work, because {group} outside of a string is not treated as a wildcard:
expand("{sample_from_group}", sample_from_group=grouped_samples[{group}].split())
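To see where an error like the one in the question can come from, here is a small plain-Python sketch. It assumes the name inside the braces was bound to a list such as groups; the question does not show the exact line that raised the error, so treat this as one plausible reproduction rather than a diagnosis.

```python
# `groups` mirrors the list loaded from the config file in the question.
groups = ["A", "B"]

try:
    # Braces outside a string build a Python set; a list cannot be a set element.
    key = {groups}
except TypeError as err:
    message = str(err)

# The message mentions "unhashable type: 'list'" (exact wording varies by Python version).
print(message)
```

Wildcards are only substituted inside strings, or handed to a function as an object, which is why the input has to be computed lazily.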
What you need is a lambda (or a named function) that takes the wildcards object and returns the expanded list:
rule analyse:
    input:
        lambda wildcards: expand("{sample_from_group}",
                                 sample_from_group=grouped_samples[wildcards.group].split())
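The same idea can be written as a named input function, sketched below in plain Python so it can be tried outside Snakemake. Two things here are assumptions and not part of the original: the helper name inputs_for_group, and the DATA_DIR prefix, which is added on the guess that the inputs should match the "../data/{sample}_mod" outputs of index_vcf. Note also that the config stores each group as a comma-separated string, so a bare .split() would leave trailing commas on the names; the sketch splits on commas and strips whitespace instead.

```python
from types import SimpleNamespace

# Mirrors the grouped_samples mapping from the question's config file.
grouped_samples = {
    "A": "A2_mod, Ah_mod, AL_mod",
    "B": "Banana_mod, BLM_mod",
}

# Assumption: inputs live where index_vcf writes its "_mod" outputs.
DATA_DIR = "../data"


def inputs_for_group(wildcards):
    """Return the input paths for the group named by wildcards.group."""
    raw = grouped_samples[wildcards.group]
    # Split on commas and strip the surrounding whitespace from each name.
    names = [name.strip() for name in raw.split(",")]
    return [f"{DATA_DIR}/{name}" for name in names if name]


# SimpleNamespace stands in for the wildcards object Snakemake would pass.
print(inputs_for_group(SimpleNamespace(group="A")))
print(inputs_for_group(SimpleNamespace(group="B")))
```

In the Snakefile, the function would be referenced by name (input: inputs_for_group) in place of the lambda; Snakemake calls it with the wildcards of each requested output.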