genome_file = Channel.fromPath(params.genome).ifEmpty {error "${params.genome}" is not found}
//genome_file = ${params.genome}
process variant_calling {
tag "variant calling is running on $sample_id"
publishDir "$params.outputDir/vcf_files", mode: "copy"
input:
tuple val (sample_id), path(bam)
file (genome_file)
output:
tuple val(sample_id),path (*.vcf*), emit:variant_file
script:
"""
gatk HaplotypeCaller -R ${genome_file} -I ${bam} -O $sample_id
"""
}
I am new to Nextflow and not sure how to resolve this issue. Any help is really appreciated.
I think the problem is that the glob pattern in your output declaration needs to be quoted, i.e. path("*.vcf*") rather than path(*.vcf*). However, you may not need the glob at all. If your process produces a single output file, you can name it explicitly using your sample_id, for example:
params.genome = "Homo_sapiens_assembly38.fasta"
params.outputDir = './results'
process variant_calling {
tag { sample_id }
publishDir "${params.outputDir}/vcf_files", mode: "copy"
input:
tuple val(sample_id), path(bam)
path reference
output:
tuple val(sample_id), path("${sample_id}.vcf.gz"), emit: vcf
script:
"""
gatk HaplotypeCaller \\
-R "${reference}" \\
-I "${bam}" \\
-O "${sample_id}.vcf.gz"
"""
}
workflow {
genome_file = file( params.genome )
...
variant_calling( samples_ch, genome_file )
}
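Note that genome_file is now created with file() rather than Channel.fromPath(), so it is passed to the process as a value channel and can be reused for every sample. A queue channel holding a single file would be consumed by the first task, and the process would only run once. When a process requires multiple input channels, what you usually want is one queue channel and one or more value channels.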
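To see the difference between the two channel types, here is a minimal sketch. The process name PAIR, the parameter params.reference and the placeholder file ref.txt are made up for illustration and are not part of your pipeline; the file is assumed to exist in the launch directory.

```nextflow
// Hypothetical demo: PAIR, params.reference and ref.txt are illustrative names only.
params.reference = 'ref.txt'   // assumed placeholder file in the launch directory

process PAIR {
    input:
    val sample_id
    path reference

    output:
    stdout

    script:
    """
    echo "${sample_id} uses ${reference}"
    """
}

workflow {
    // Queue channel holding three sample names
    samples_ch = Channel.of('sampleA', 'sampleB', 'sampleC')

    // file() returns a single path; when passed to a process it behaves as a
    // value channel, so it is reused for every item in samples_ch.
    // checkIfExists makes the run fail early if the file is missing.
    reference = file(params.reference, checkIfExists: true)

    PAIR(samples_ch, reference)
    PAIR.out.view()

    // By contrast, Channel.fromPath(params.reference) would give a queue
    // channel with one item, and PAIR would run for one sample only.
}
```

With file(), PAIR should run once per sample. Swapping in Channel.fromPath() is a quick way to reproduce the "only one sample is processed" behaviour.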