I have a phylogenetic tree I am loading in BioPerl. I want to use a custom sort function ("$depth_sort" below) to order the nodes vertically when printing out the tree. However the documented method used below writes out the tree in the same order it was read in, without actually sorting:
use strict;
use Bio::TreeIO;
my $input = new Bio::TreeIO(-file=>"input.tree", -format=>'newick');
my $tree = $input->next_tree;
my $depth_sort = sub { $a->depth <=> $b->depth };
#this is the way it is supposed to work:
my $out = new Bio::TreeIO(
-file => ">sorted.tree",
-format => 'newick',
-order_by => $depth_sort
);
$out->write_tree($tree);
Instead, I can manually iterate through the tree to see what is happening. If I do my own sort locally, it works as expected:
my $rnode = $tree->get_root_node;
&printNode($rnode);
sub printNode {
my $thisNode = shift;
#print the node identity and depth, just to keep track:
print "this is " . $thisNode->internal_id .
" with depth " . $thisNode->depth . "\n";
if (! $thisNode->is_Leaf) {
my @children = sort $depth_sort $thisNode->each_Descendent();
for my $otherNode (@children) { &printNode($otherNode); }
}
}
But, if I pass the custom sort to each_Descendent (the way the write call above is supposed to):
my @children = $thisNode->each_Descendent($depth_sort);
Then it dies with the message:
Can't call method "depth" on an undefined value at treeFlipper.pl line 7, line 1.
I found another thread here that I think has me on the right track, but I haven't solved it yet: Perl sorting; dealing with $a, $b package globals across namespaces cleanly
Switching to the prototyping method in the first answer:
my $depth_sort = sub ($$) {
my ($a1,$b1) = @_;
return ($a1->depth <=> $b1->depth) };
gives me:
Can't call method "depth" on unblessed reference at treeFlipper.pl line 9, line 1.
But if I check the ref type, that seems correct:
print ref($a1) . "\n";
gives
Bio::Tree::Node
When I try force-blessing the reference (this is probably a terrible thing to do):
bless $a1, 'Bio::Tree::Node';
bless $b1, 'Bio::Tree::Node';
I get:
Not a HASH reference at /usr/local/share/perl/5.14.2/Bio/Tree/Node.pm line 433, line 1.
The other methods in that thread (using caller) give me the same old "undefined value" error.
Is this a problem with BioPerl, or (as I suspect) am I still missing something? Thanks!
edit: using Perl 5.14.2 on Ubuntu 12.04 LTS
There turned out to be three parts to this:
The sort function does need to be prototyped and indexed directly OR accessed through the calling package, as documented (for instance) in the other thread I linked to.
Why it still didn't work after fixing the sort function is a bit of a mystery, but is probably non-reproducible. It was solved by reinstalling BioPerl.
Finally, it turns out that the current version of BioPerl (I have 1.6.9) doesn't fully implement the "order_by" parameter --the Bio::TreeIO::newick constructor does not set the passed-in value like it should. However, I was able to get around this by explicitly setting the parameter after construction:
my $out = new Bio::TreeIO(
-file => ">sorted.tree",
-format => 'newick',
);
$out->get_params->{'order_by'}=$depth_sort;
$out->write_tree($tree);
This worked as expected.