I'm new to Snakemake and can't figure out how to solve this problem.
I have a rule with two inputs:
rule test:
    input:
        input_file1=f1,
        input_file2=f2
f1 is in [A{1}$, A{2}£, B{1}€, B{2}¥]
f2 is in [C{1}, C{2}]
The numbers are wildcards that come from an expand call. I need a way to pass f1 and f2 a pair of files whose numbers match exactly. For example:
f1 = A1
f2 = C1
or
f1 = B1
f2 = C1
I have to avoid combinations such as:
f1 = A1
f2 = C2
I could write a function that matches the files this way, but it would have to handle input_file1 and input_file2 at the same time. I thought of writing a function that builds a dictionary of the allowed combinations, but how would I "iterate" over it during the expand?
Thanks
Assuming rule test writes an output file named {f1}.{f2}.txt, you need some mechanism that correctly pairs f1 and f2 and creates the list of {f1}.{f2}.txt files.
How you create this list is up to you. expand is just a convenience function for that, and in this case you may want to avoid it.
Here's a super simple example:
import re

fin1 = ['A1$', 'A2£', 'B1€', 'B2¥']
fin2 = ['C1', 'C2']

# Build the list of target files, keeping only matching (f1, f2) pairs
outfiles = []
for x in fin1:
    for y in fin2:
        ## Here you pair f1 and f2. This is a very trivial way of doing it:
        ## y[1] is the number in the f2 name ('1' or '2')
        if y[1] in x:
            outfiles.append('%s.%s.txt' % (x, y))

# Restrict the wildcards to the known names; re.escape protects
# special characters such as '$'
wildcard_constraints:
    f1 = '|'.join([re.escape(x) for x in fin1]),
    f2 = '|'.join([re.escape(x) for x in fin2]),

rule all:
    input:
        outfiles,

rule test:
    input:
        input_f1 = '{f1}.txt',
        input_f2 = '{f2}.txt',
    output:
        '{f1}.{f2}.txt',
    shell:
        r"""
        cat {input} > {output}
        """
This pipeline will execute the following commands:
cat A2£.txt C2.txt > A2£.C2.txt
cat A1$.txt C1.txt > A1$.C1.txt
cat B1€.txt C1.txt > B1€.C1.txt
cat B2¥.txt C2.txt > B2¥.C2.txt
If you first create the starting input files with touch 'A1$.txt' 'A2£.txt' 'B1€.txt' 'B2¥.txt' 'C1.txt' 'C2.txt', you should be able to run this example.
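Since you mentioned a dictionary of allowed combinations, here is a rough sketch of that idea in plain Python. It is only one possible way to do the pairing, not something Snakemake requires. The helper sample_number and the names f2_by_number and pairs are made up for this illustration, and the sketch assumes that every file name contains exactly one run of digits. Unlike the y[1] in x test above, it compares whole numbers, so 1 would not be matched against 10.

```python
import re
from collections import defaultdict

fin1 = ['A1$', 'A2£', 'B1€', 'B2¥']
fin2 = ['C1', 'C2']


def sample_number(name):
    """Return the first run of digits in a name (hypothetical helper)."""
    match = re.search(r'\d+', name)
    if match is None:
        raise ValueError('no number found in %r' % name)
    return match.group()


# Dictionary of allowed partners: number -> f2 names carrying that number
f2_by_number = defaultdict(list)
for y in fin2:
    f2_by_number[sample_number(y)].append(y)

# Keep only the (f1, f2) pairs whose numbers agree
pairs = [(x, y) for x in fin1 for y in f2_by_number.get(sample_number(x), [])]

# Target files to request in rule all
outfiles = ['%s.%s.txt' % (x, y) for x, y in pairs]
print(outfiles)
```

The resulting outfiles list can be used in rule all exactly as in the example above. If you would rather keep expand, as far as I know it accepts zip as a second argument to combine lists element-wise instead of taking their product, so passing the first and second elements of pairs as f1 and f2 should give the same list, but I have not tested that here.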