How to reuse the same process twice within the same module in Nextflow DSL2, but save the output under a different name?


I am building a Nextflow workflow (DSL2), and within the same module I need to reuse the same process twice. Nextflow rejects the syntax I am using: it complains that the process has already been used and that I need to alias it properly, but I don't know how to do that.

Here is the error:

Process 'ReUsable' has been already used -- If you need to reuse the same component, include it with a different name or include it in a different workflow context

And here is a meaningful example:

process ReUsable {
    publishDir "${params.publish_dir}/VF/SZ/${meta}", mode: 'copy', pattern: '*.out'
    container 'bla.1'
    input:
        tuple val(meta), path(input)
    output:
        tuple val(meta), path('one.out'), emit: out1
    script:
        """
        cat ${input} > one.out
        """
}
process ScndStep {
    publishDir "${params.publish_dir}/VF/${meta}", mode: 'copy', pattern: '*sub.out'
    container 'bla.2'
    input:
        tuple val(meta), path(out1)
    output:
        tuple val(meta), path('two.out'), emit: out2
    script:
        """
        sed "s/i/o/g" ${out1} > two.out
        """
}
workflow Unite {
    take:
        input
    main:
        out1 = ReUsable(input)
        out2 = ScndStep(out1)
        out3 = ReUsable(out2)
            .SaveAs{three.out}
    emit:
        out1
        out2
        out3
}

Clearly this is not the correct way. When I looked around, the recommendation was to set up a new module and then include ReUsable twice, each time under a different name. That seems to me to be overcomplicating things, however. The whole point here is to keep it light and reuse the same process within the same workflow; having to write a separate ReUsable2 process is out of scope. Any ideas? Many thanks.


Solution

  • Yes, the simplest approach is to use the include keyword with an alias (include { X as Y }). Module aliases let you invoke the same process more than once in a workflow, each time under a different name. For example:

    Contents of main.nf:

    include { ReUsable as ReUsable1 } from './modules/reusable'
    include { ReUsable as ReUsable2 } from './modules/reusable'
    include { ScndStep } from './modules/secondstep'
    
    
    workflow Unite {
    
          take:
          input_ch
    
          main:
          input_ch | ReUsable1 | ScndStep | ReUsable2
    
          emit:
          out1 = ReUsable1.out
          out2 = ScndStep.out
          out3 = ReUsable2.out
    }
    

    Contents of modules/reusable/main.nf:

    process ReUsable {
    
        input:
        tuple val(meta), path(input)
    
        output:
        tuple val(meta), path('one.out'), emit: out1
    
        script:
        """
        cat ${input} > one.out
        """
    }
    

    Contents of modules/secondstep/main.nf:

    process ScndStep {
    
        input:
        tuple val(meta), path(out1)
    
        output:
        tuple val(meta), path('two.out'), emit: out2
    
        script:
        """
        sed "s/i/o/g" ${out1} > two.out
        """
    }
    

    Contents of nextflow.config:

    params {
    
        publish_dir = './results'
    }
    
    process {
    
        withName: ReUsable1 {
    
            publishDir = [
                path: "${params.publish_dir}/ReUsable1",
                mode: 'copy',
            ]
        }
    
        withName: ScndStep {
    
            publishDir = [
                path: "${params.publish_dir}/ScndStep",
                mode: 'copy',
            ]
        }
    
        withName: ReUsable2 {
    
            publishDir = [
                path: "${params.publish_dir}/ReUsable2",
                mode: 'copy',
            ]
        }
    }
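
    The question also asked about saving the second run's output under a different name. One possible way, sketched here rather than tested, is the saveAs option of publishDir, which takes a closure that maps each published filename to a new name. The target name three.out is taken from the question; the block below is meant as a replacement for the withName: ReUsable2 entry in the nextflow.config above.

    ```groovy
    process {

        withName: ReUsable2 {

            publishDir = [
                path: "${params.publish_dir}/ReUsable2",
                mode: 'copy',
                // Rename only the published copy of one.out;
                // any other file keeps its original name.
                saveAs: { filename -> filename == 'one.out' ? 'three.out' : filename },
            ]
        }
    }
    ```

    Note that saveAs only affects the copy written to the publish directory. Inside the work directory, and in the channel emitted by ReUsable2, the file is still called one.out.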