I have a script which takes headers of a multi-fasta file and pushes them into an array. Then I want to loop through this array to find a specific pattern and perform some commands.
open(FH, '<', $ref_seq) or die $!;
while(<FH>){
$line = $_;
chomp $line;
if(m/^>([^\s]+)/){
$ref_header = $1;
print "$ref_header\n";
chomp $header;
if($1 eq $header){
$ref_header = $header;
#print "header is $ref_header\n";
}
}
}
This code prints headers like
chr1
chr2
chr3
How can I push these headers into an array?
I tried following code, but it splits individual letters, instead of $header_array[0]
being chr1
@header_array = split(/\n*/, $ref_header);
print ("Here's the first element $header_array[0]");
Any help will be appreciated.
Shorten the code as shown below, removing some extra statements, and use push
. You can combine push
and the pattern match:
#!/usr/bin/env perl
use strict;
use warnings;
use Carp;
my $in_file = shift;
my @headers;
open my $in_fh, '<', $in_file or croak "cannot open $in_file: $!";
while ( <$in_fh> ) {
push @headers, />(\S+)/;
}
close $in_fh or croak "cannot close $in_file: $!";
print "@headers";
# Now, loop through headers and select the ones you need, for example:
for my $header ( @headers ) {
if ( $header =~ /foo/ ) {
# do something
}
}
A few suggestion on fixing your original code are below:
# Always use strict and use warnings.
# Remove extra parens and make the error message more informative:
open(FH, '<', $ref_seq) or die $!;
while(<FH>){
$line = $_;
chomp $line;
# [^\s] is simply \S:
if(m/^>([^\s]+)/){
$ref_header = $1;
print "$ref_header\n";
# where is $header coming from?
chomp $header;
# if the condition is satisfied, this assignment does not make sense:
# $ref_header is already the same as $header:
if($1 eq $header){
$ref_header = $header;
#print "header is $ref_header\n";
}
}
}