Can't get fromFilePairs channel factory to work properly


I'm trying to use the fromFilePairs channel factory in my workflow, but it seems like it keeps emitting individual files instead of pairs. I'm trying to get it working with just two files, fileA-1.txt and fileA-2.txt.

This is the code I have:

workflow test {
    paired_files = channel.fromFilePairs(file("*{1,2}.txt")).view()
}

Which produces the output:

[fileA-2, [fileA-2.txt]]
[fileA-1, [fileA-1.txt]]

I would expect it to produce the following output:

[fileA-, [fileA-1.txt, fileA-2.txt]]

Any help would be appreciated.


Solution

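  • The file() method returns a list of paths when its argument contains wildcard characters, so fromFilePairs receives the already-resolved files instead of a glob pattern, and each file ends up in a group of its own. Instead, pass the pattern to fromFilePairs as a plain string, for example: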

    params.input = "file-{1,2}.txt"
    
    workflow {
    
        Channel.fromFilePairs( params.input ).view()
    }
    

    Results:

    [file, [/path/to/file-1.txt, /path/to/file-2.txt]]
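
    Applied to the two files from the question, the same fix might look like the sketch below. It assumes fileA-1.txt and fileA-2.txt are in the launch directory; the matches variable is only there for illustration, to show what file() hands back when given a glob.

    ```groovy
    workflow {
        // file() expands the glob immediately and returns the matching paths,
        // which is why fromFilePairs never saw the pattern in the original code
        def matches = file('*{1,2}.txt')
        println "file() returned a list: ${matches instanceof List}"

        // Passing the glob as a plain string lets fromFilePairs do the grouping
        channel.fromFilePairs('*{1,2}.txt').view()
    }
    ```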
    


    If you need a more complex pattern with fromFilePairs, the pattern must yield the same prefix for every file that belongs together, otherwise the files will not be grouped as expected. An easy way to check this is to call the readPrefix helper function directly on each file. For example:

    test_bam = file('/path/to/sample.1.bam')
    test_bai = file('/path/to/sample.1.bam.bai')
    
    pattern = '/path/to/sample*.{bam,bai}'
    
    println("My BAM prefix is: ${Channel.readPrefix(test_bam, pattern)}")
    println("My BAI prefix is: ${Channel.readPrefix(test_bai, pattern)}")
    

    With this pattern, the two files do not produce a common prefix:

    $ nextflow run main.nf 
    
     N E X T F L O W   ~  version 24.04.4
    
    Launching `main.nf` [loving_liskov] DSL2 - revision: 85fd0722f7
    
    My BAM prefix is: sample.1
    My BAI prefix is: sample.1.bam
    

    However, if you change your pattern definition to:

    pattern = '/path/to/sample*.bam{,.bai}'
    

    Re-running the check shows that both files now return a common prefix:

    $ nextflow run main.nf 
    
     N E X T F L O W   ~  version 24.04.4
    
    Launching `main.nf` [festering_sammet] DSL2 - revision: 4f49fab742
    
    My BAM prefix is: sample.1
    My BAI prefix is: sample.1
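
    Once readPrefix reports the same prefix for both files, the pattern can be handed to fromFilePairs as a string. A minimal sketch, reusing the placeholder /path/to location from the example above and assuming both files exist there:

    ```groovy
    workflow {
        // Same corrected pattern as above: the BAM and its index share a prefix
        def pattern = '/path/to/sample*.bam{,.bai}'

        // Each emitted item is a tuple of the common prefix and the matched files
        channel.fromFilePairs(pattern).view()
    }
    ```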