I'm new to Nextflow and I'm struggling to understand how process inputs and outputs are wired together. I know that Nextflow has DAG visualisation, a useful feature for drawing a directed graph of the workflow.
I have a small example chart like this:
I want to write a Nextflow script for the pipeline above. In particular, I want the outputs of process A to be connected to processes B and C in a specific way, and I want the output names to be shown in the generated flowchart (when run with the -with-dag option).
I'd really appreciate any help. Thanks.
Addendum
This is my script. At my level, I only know how to use path for my outputs. This makes the script more verbose because of all the file paths. Above all, when I use the DAG drawing feature, the result isn't as clear as the initial flowchart.
#!/usr/bin/env nextflow

params.input_text = "abc"

// Split the input string into one file per character (0.txt, 1.txt, ...)
process A {
    input:
    val text

    output:
    path A_folder

    script:
    """
    mkdir A_folder
    string=$text
    for element in \$(seq 0 \$((\${#string}-1)))
    do
        echo \${string:\$element:1} > A_folder/\$element.txt
    done
    """
}

// Concatenate the first two characters
process B {
    input:
    path A_folder

    output:
    path B_folder

    script:
    """
    mkdir B_folder
    echo \$(cat $A_folder/0.txt)\$(cat $A_folder/1.txt) > B_folder/glue1.txt
    """
}

// Replace 'c' with '3' in the third character
process C {
    input:
    path A_folder

    output:
    path C_folder

    script:
    """
    mkdir C_folder
    echo \$(cat $A_folder/2.txt | sed 's/c/3/g') > C_folder/tras.txt
    """
}

// Concatenate the result of C with the result of B
process D {
    input:
    path B_folder
    path C_folder

    output:
    path D_folder

    script:
    """
    mkdir D_folder
    echo \$(cat $C_folder/tras.txt)\$(cat $B_folder/glue1.txt) > D_folder/glue2.txt
    """
}

workflow {
    process_A = A(params.input_text)
    process_B = B(process_A)
    process_C = C(process_A)
    process_D = D(process_B, process_C)
}
In summary, my question is: after writing the code and running the script (nextflow run script.nf -with-dag flow.png), how can I get a flowchart that is as close as possible to the first chart?
As of version 22.04.0, Nextflow can do DAG visualisation using the Mermaid renderer. All you need to do is change the output file extension to mmd, for example:
nextflow run main.nf -with-dag flow.mmd
We can also simplify the workflow a bit by using native execution (exec: blocks instead of shell scripts) to get close to the desired result:
params.input_text = "abc"

process process_A {
    input:
    val text

    output:
    val a, emit: foo
    val b, emit: bar
    val c, emit: baz

    exec:
    (a, b, c) = text.collect()
}

process process_B {
    input:
    val x
    val y

    output:
    val z

    exec:
    z = x + y
}

process process_C {
    input:
    val a

    output:
    val b

    exec:
    b = a.replaceAll('c', '3')
}

process process_D {
    input:
    val one
    val two

    output:
    val three

    exec:
    three = two + one
}

workflow {
    entry_input = Channel.of( params.input_text )

    (output_1, output_2, output_3) = process_A(entry_input)
    (output_4) = process_B( output_1, output_2 )
    (output_5) = process_C( output_3 )
    (final_output) = process_D( output_4, output_5 )

    final_output.view()
}
Results:
$ nextflow run main.nf -with-dag flow.mmd
N E X T F L O W ~ version 23.04.1
Launching `main.nf` [distraught_lorenz] DSL2 - revision: a1f4411ded
executor > local (4)
[a7/e97d7c] process > process_A (1) [100%] 1 of 1 ✔
[53/317d41] process > process_B (1) [100%] 1 of 1 ✔
[b2/88be6d] process > process_C (1) [100%] 1 of 1 ✔
[39/38f318] process > process_D (1) [100%] 1 of 1 ✔
3ab
$ cat flow.mmd
flowchart TD
p0((Channel.of))
p1[process_A]
p2[process_B]
p3[process_C]
p4[process_D]
p5([view])
p6(( ))
p0 -->|entry_input| p1
p1 -->|output_1| p2
p1 -->|output_2| p2
p1 -->|output_3| p3
p2 -->|output_4| p4
p3 -->|output_5| p4
p4 -->|final_output| p5
p5 --> p6
We can then produce an image with the Mermaid Live Editor and the 'default' theme:
Additional thoughts:
Using parentheses around the channel declarations in the workflow block seems to prevent Nextflow from using the output channel names defined in the process blocks. Under the old DSL, the output of process_D (val three) was just shorthand for val three into three. Under DSL2, it appears the output channels still get named the same way, but of course we no longer need the into keyword.
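If the goal is to rely on the names declared with emit: rather than on variable names assigned in the workflow block, one option is to reference the named outputs through .out. The following is a minimal sketch of a replacement workflow block, assuming the four processes (process_A to process_D) are defined exactly as in the simplified script above. I have not verified how the edge labels render in the resulting DAG with this form, so treat it as something to experiment with rather than a confirmed fix:

```nextflow
// Sketch only: drop-in replacement for the workflow block above.
// Assumes process_A declares its outputs with emit: foo, bar and baz.
workflow {
    entry_input = Channel.of( params.input_text )

    process_A( entry_input )

    // Named outputs are accessed as <process>.out.<emit name>
    process_B( process_A.out.foo, process_A.out.bar )
    process_C( process_A.out.baz )

    // A process with a single output exposes it directly as <process>.out
    process_D( process_B.out, process_C.out )

    process_D.out.view()
}
```

The data flow is the same as before, so the final value printed by view() should not change. Only the way the channels are referenced differs, which makes it easy to compare the two flow.mmd files side by side.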