Can I stop a rule in a Snakefile from being executed in parallel?


As the last rule of my snakemake workflow, I tried to concatenate the files the workflow creates. To separate and identify the contents of each file, I echo the file name first in the shell command as a separator tag (see the code below):

rule cat:
    input:
        expand('Analysis/typing/{sample}_type.txt', sample=samples)        
    output:            
        'Analysis/typing/Sum_type.txt'        
    shell:            
        'echo {input} >> {output} && cat {input} >> {output}'

I was expecting the result in this format:

file name of sample 1

content of sample 1

file name of sample 2

content of sample 2

Instead, I got this format:

file name of sample 1 file name of sample 2 ...

content of sample 1 content of sample 2 ...

It seems snakemake executes the echo command for all files in parallel first and then executes the cat command. What can I do to get the format I wanted?

Thanks


Solution

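  • The output you get is a consequence of how bash works, not of snakemake running anything in parallel. Snakemake replaces {input} with all input file names separated by spaces, so the shell runs a single echo that prints every file name on one line, followed by a single cat that concatenates every file. I think the snakemake way of doing it would be one rule that prepends the file name to the content of each file and a second rule that concatenates the results. E.g. (not checked for errors):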

    rule cat:
        input:
            'Analysis/typing/{sample}_type.txt',
        output:
            temp('Analysis/typing/{sample}_type.txt.out'),
        shell:
            r"""
            echo {input} > {output}
            cat {input} >> {output}
            """
            
    rule cat_all:
        input:
            expand('Analysis/typing/{sample}_type.txt.out', sample=samples)        
        output:            
            'Analysis/typing/Sum_type.txt'        
        shell:            
            r"""
            cat {input} > {output}
            """
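
    If the intermediate .out files are not wanted, the same header-then-content layout could probably be produced in a single step with plain Python. The following is only a minimal sketch: the helper name concat_with_headers is made up for illustration and is not part of snakemake, and it assumes the inputs are small text files.

```python
from pathlib import Path
from typing import Iterable


def concat_with_headers(paths: Iterable[str], out_path: str) -> None:
    """Write each file's name on its own line, followed by that file's content.

    Hypothetical helper for illustration; not part of snakemake.
    """
    with open(out_path, "w") as out:
        for path in paths:
            # Header line: the file name, as echo would print it.
            out.write(f"{path}\n")
            text = Path(path).read_text()
            out.write(text)
            # Keep the next header on its own line if the file
            # does not end with a newline.
            if text and not text.endswith("\n"):
                out.write("\n")
```

    Inside a rule, this could presumably be called from a run: block instead of shell:, for example as concat_with_headers(input, output[0]), because the files are then processed one at a time rather than being expanded into a single command line.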