
Is there a BioPerl equivalent of IO::ScalarArray for an array of Seq objects?


In Perl, we have IO::ScalarArray for treating the elements of an array like the lines of a file. In BioPerl, we have Bio::SeqIO, which can produce a filehandle that reads and writes Bio::Seq objects instead of strings representing lines of text. I would like to do a combination of the two: I would like to obtain a handle that reads successive Bio::Seq objects from an array of such objects. Is there any way to do this? Would it be trivial for me to implement a module that does this?

My reason for wanting this is that I would like to be able to write a subroutine that accepts either a Bio::SeqIO handle or an array of Bio::Seq objects, and I'd like to avoid writing separate loops based on what kind of input I get. Perhaps the following would be better than writing my own IO module?

sub process_sequences {
    my ($input) = @_;

    # read either from array of Bio::Seq or from Bio::SeqIO
    my $nextseq;
    if (ref $input eq 'ARRAY') {
        my $pos = 0;
        $nextseq = sub { return $input->[$pos++] if $pos < @$input };
    }
    else {
        $nextseq = sub { $input->getline() };
    }

    while (my $seq = $nextseq->()) {
        do_cool_stuff_with($seq);
    }
}

Solution

  • Your solution looks like it should work. Unless you really want to spend a lot of time solving this problem, go with it until you don't like it anymore. I might have written that like so to avoid typing the variable name several times:

    my $nextseq = do {
        if (ref $input eq ref []) {
            my $pos = 0;  # maybe a state variable if you have Perl 5.10
            sub { return $input->[$pos++] if $pos < @$input };
        }
        else {
            sub { $input->getline() };
        }
    };
    

    If you're interested in iterators, though, check out Mark Jason Dominus's Higher-Order Perl, which covers many ways to build and combine them.
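If you do eventually want the module-style approach the question asks about, a minimal sketch could look like the following. It assumes you are working with Bio::SeqIO objects (which provide a `next_seq` method) rather than the tied filehandles the code above reads with `getline`. The package name `ArraySeqIO` and the helper `collect_sequences` are hypothetical names made up for illustration; neither is part of BioPerl.

```perl
use strict;
use warnings;

# Hypothetical wrapper class (NOT part of BioPerl): exposes an array
# through a next_seq() method, the same method name Bio::SeqIO uses.
package ArraySeqIO;

sub new {
    my ($class, $seqs) = @_;
    # Copy the elements so later changes to the caller's array
    # do not affect iteration.
    return bless { seqs => [@$seqs], pos => 0 }, $class;
}

sub next_seq {
    my ($self) = @_;
    return if $self->{pos} >= @{ $self->{seqs} };   # exhausted: undef in scalar context
    return $self->{seqs}[ $self->{pos}++ ];
}

package main;

# Hypothetical helper: accepts either an array ref or any object with a
# next_seq() method, and runs a single loop over both kinds of input.
sub collect_sequences {
    my ($input) = @_;
    my $stream = ref $input eq 'ARRAY' ? ArraySeqIO->new($input) : $input;

    my @seen;
    while (defined(my $seq = $stream->next_seq)) {
        push @seen, $seq;    # stand-in for do_cool_stuff_with($seq)
    }
    return @seen;
}
```

The sketch does not depend on BioPerl itself, so plain strings work as stand-ins for Bio::Seq objects when trying it out. Whether this is worth it over the closure version is a judgment call: the class is more typing, but the calling code no longer needs to know which kind of input it received.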