I have been trying to write a code using BioPerl that will query Genbank for a specific protein and then print the results to a fasta file. So far the code I have works and I can print the results to the screen, but not to a file. I have done lots of researching on the BioPerl website and other sources (CPAN, PerlMonks, etc.) but I have not found anything that can solve my problem. I understand how to read something from a file and then print the output to a new file (using SeqIO), but the problem I am having seems to be that what I want the program to read is not stored in a text or FASTA file, but is the result of a database query. Help? I am very much a beginner, new to Perl/BioPerl and programming in general.
Here is the code I have so far:
#!usr/bin/perl
use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;
use Bio::Seq;
$query = "Homo sapiens[ORGN] AND TFII-I[TITL]";
$query_obj = Bio::DB::Query::GenBank->new(-db => 'protein', -query => $query);
$gb_obj = Bio::DB::GenBank->new;
$stream_obj = $gb_obj->get_Stream_by_query($query_obj);
while ($seq_obj = $stream_obj->next_seq)
{print $seq_obj->desc, "\t", $seq_obj->seq, "\n";
}
So, what I want to do in the last line is instead of printing to the screen, print to a file in fasta format.
Thanks, ~Jay
Your code was actually quite close, you are returning a Bio::Seq object in your loop, and you just need to create a Bio::SeqIO object that can handle those objects and write them to a file ("myseqs.fasta" is the file in the example).
#!usr/bin/env perl
use strict;
use warnings;
use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;
use Bio::SeqIO;
my $query = "Homo sapiens[ORGN] AND TFII-I[TITL]";
my $query_obj = Bio::DB::Query::GenBank->new(-db => 'protein', -query => $query);
my $gb_obj = Bio::DB::GenBank->new(-format => 'fasta');
my $stream_obj = $gb_obj->get_Stream_by_query($query_obj);
my $seq_out = Bio::SeqIO->new(-file => ">myseqs.fasta", -format => 'fasta');
while (my $seq_obj = $stream_obj->next_seq) {
$seq_out->write_seq($seq_obj);
}
Also, note that I added use strict;
and use warnings;
to the top of the script. That will help solve most "Why isn't this working?" type of questions by generating diagnostic messages, and it is a good idea to include those lines.