I am trying to create a rules file for the bioinformatics tool FMAP (https://github.com/jiwoongbio/FMAP).
I am stuck on writing a rule for the FMAP_table.pl script. This is my current rule:
rule fmap_table:
    input:
        expand(str(CLASSIFY_FP/"mapping"/"{sample}_abundance.txt"), sample=Samples.keys())
    output:
        str(CLASSIFY_FP/'mapping'/'abundance_table.txt')
    shell:
        """
        perl /media/data/FMAP/FMAP_table.pl {input} > {output}
        """
I would like the column names to contain only the sample names, not the whole path. The script supports this through an optional name= prefix on each input file:
perl FMAP_table.pl [options] [name1=]abundance1.txt [[name2=]abundance2.txt [...]] > abundance_table.txt
My problem is how to get, for each sample file, the sample name and the file path, and join the two with = in between.
My sample files are named like SAMPLE111_S1_abundance.txt. This is the command I would like to generate automatically:
perl /media/data/FMAP/FMAP_table.pl SAMPLE111_S1=SAMPLE111_S1_abundance.txt SAMPLE112_S2=SAMPLE112_S2_abundance.txt [etc.] > abundance.txt
Thanks
I would add a params entry that builds those name=file arguments, and perhaps also build the file names in a dict outside the rule:
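# Map each sample name to its abundance file. The f-string is needed here:
# a plain "{sample}_abundance.txt" is only filled in by expand(), not by a dict comprehension.
FMAP_INPUTS = {sample: str(CLASSIFY_FP/"mapping"/f"{sample}_abundance.txt")
               for sample in Samples.keys()}

rule fmap:
    input:
        list(FMAP_INPUTS.values())
    output:
        str(CLASSIFY_FP/'mapping'/'abundance_table.txt')
    params:
        # One "name=path" argument per sample, separated by spaces
        names=" ".join(f"{s}={f}" for s, f in FMAP_INPUTS.items())
    shell:
        """
        perl /media/data/FMAP/FMAP_table.pl {params.names} > {output}
        """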