Snakemake: inserting sample name before every input file in one rule


I am trying to create a rules file for a bioinformatics tool FMAP. https://github.com/jiwoongbio/FMAP

I am stuck at creating a rule for the FMAP_table.pl script. This is my current rule:

rule fmap_table:
    input:
        expand(str(CLASSIFY_FP/"mapping"/"{sample}_abundance.txt"), sample=Samples.keys())
    output:
        str(CLASSIFY_FP/'mapping'/'abundance_table.txt')
    shell:
        """
        perl /media/data/FMAP/FMAP_table.pl {input} > {output}
        """

I would like my column names to contain only the sample names, not the whole path. The script supports this through an optional name= prefix on each input file:

perl FMAP_table.pl [options] [name1=]abundance1.txt [[name2=]abundance2.txt [...]] > abundance_table.txt 

My problem is how to take each sample's name and file path and join them with an = in between.

My samples are named like this: SAMPLE111_S1_abundance.txt. This is the format I would like to generate automatically:

perl /media/data/FMAP/FMAP_table.pl SAMPLE111_S1=SAMPLE111_S1_abundance.txt SAMPLE112_S2=SAMPLE112_S2_abundance.txt [etc.] > abundance.txt

Thanks


Solution

  • One option is to build the name=path arguments in a params entry, and to build the file names in a dict outside the rule:

    # Map each sample name to its abundance file; the f-string fills in the sample name.
    FMAP_INPUTS = {sample: str(CLASSIFY_FP/"mapping"/f"{sample}_abundance.txt")
                   for sample in Samples.keys()}
    
    rule fmap:
        input: list(FMAP_INPUTS.values())
        output:
            str(CLASSIFY_FP/'mapping'/'abundance_table.txt')
        params:
            names=" ".join(f"{s}={f}" for s,f in FMAP_INPUTS.items())
        shell:
            """
            perl /media/data/FMAP/FMAP_table.pl {params.names} > {output}
            """
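
    To check the argument string before running the workflow, the same construction can be tried in plain Python. This is only a sketch: the CLASSIFY_FP and Samples values below are hypothetical stand-ins for the ones defined in the real Snakefile, and fmap_table_args is a helper name made up for illustration.

```python
from pathlib import Path

# Hypothetical stand-ins for the workflow's own CLASSIFY_FP and Samples.
CLASSIFY_FP = Path("classify")
Samples = {"SAMPLE111_S1": None, "SAMPLE112_S2": None}


def fmap_table_args(samples, mapping_dir):
    """Return space-separated 'name=path' arguments, one per sample."""
    pairs = []
    for sample in samples:
        # Build the per-sample file path, then prefix it with the sample name.
        path = mapping_dir / f"{sample}_abundance.txt"
        pairs.append(f"{sample}={path}")
    return " ".join(pairs)


args = fmap_table_args(Samples, CLASSIFY_FP / "mapping")
print(args)
```

    Each printed item has the form name=path with no spaces around the =, which matches the [name1=]abundance1.txt usage shown in the question. Sample names or paths containing spaces would need shell quoting, which this sketch does not handle.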