What do these Perl variables mean?


I'm fairly new to Perl coding conventions. Could someone help explain:

  • why are there / and /< in front of Perl variables?
  • what do \= and =~ mean, and what is the difference?
  • why does the code require an ending / before the ;, e.g. /start=\'([0-9]+)\'/?

The first three sub-questions were more or less answered by the perldocs, but what does the following line in the code mean?

push(@{$Start{$start}},$features);

I understand that we are pushing $features onto a @Start array, but what does @{$Start{$start}} mean? Is it the same as: @Start = ($start);

Within the code there is something like this:

use FileHandle;

sub open_infile {
  my $file = shift;
  my $in = FileHandle->new($file,"<:encoding(UTF-8)")
      or die "ERROR: cannot open $file: $!\n" if ($Opt_utf8);
  $in = new FileHandle("$file")
      or die "ERROR: cannot open $file: $!\n" if (!$Opt_utf8);
  return $in;
}

$uamf = shift @ARGV;
$uamin = open_infile($uamf);


while (<$uamin>) {
    chomp;
    if(/<segment /){
        /start=\'([0-9]+)\'/;
        /end=\'([0-9]+)\'/;
        /features=\'([^\']+)\'/;
        $features =~ s/annotation;//;

        push(@{$Start{$start}},$features);
        push(@{$End{$end}},$features);
    }
}

EDITED

So after some intensive reading of the Perl docs, here is what I've gathered:

  • /<segment / is a regex match that checks whether the line read by while (<$uamin>) contains the string <segment followed by a space.
  • Similarly, /start=\'([0-9]+)\'/ has nothing to do with instantiating a variable. It is a regex match that checks whether the line read by while (<$uamin>) contains start=' followed by one or more digits and a closing '; the ([0-9]+) part captures that run of digits.
  • In $features =~ s/annotation;//, the =~ is used to bind the substitution to $features, so the regex is applied to that variable rather than to the default $_. See What does =~ do in Perl?
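To make these three points concrete, here is a minimal sketch. The sample line and the names $line, $text, $bare_match and $bound_match are made up for illustration and do not come from the original script.

```perl
use strict;
use warnings;

# Hypothetical sample line, invented for this example.
my $line = "<segment start='12' end='34' features='annotation;noun;plural'>";

# A bare /.../ matches against $_, the variable that while (<$fh>) fills.
my $bare_match;
for ($line) {                       # "for" aliases $_ to $line here
    $bare_match = /<segment / ? 1 : 0;
}

# =~ binds a match to a named variable instead of $_.
my $bound_match = ($line =~ /start='[0-9]+'/) ? 1 : 0;

# =~ with s/// edits the named variable in place (first occurrence only).
my $text = "annotation;noun;plural";
$text =~ s/annotation;//;

print "bare=$bare_match bound=$bound_match text=$text\n";
```

Run as written, this should print bare=1 bound=1 text=noun;plural. The closing / simply ends the pattern, the same way a closing quote ends a string.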

Solution

  • Where did you see this syntax (or more to the point: have you edited stuff out of what you saw)? /foo/ represents the match operator using regular expressions, not variables. In other words, the first line is checking to see if the input string $_ contains the character sequence <segment.

    The subsequent three lines essentially do nothing useful, in the sense that they run regular expression matches and then discard the results (there are side effects, such as setting the capture variable $1, but each subsequent successful match overwrites them).

    The last line does a substitution, replacing the first occurrence of the characters annotation; with the empty string in the string $features.

    Run the command perldoc perlretut to learn about regex in Perl.
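As for the push line in the question, the following is a rough sketch of how the captures could be kept and what the push then does with them. It is one possible rewrite, not the original author's code: the sample lines are invented, and skipping lines that lack any of the three attributes is an assumption.

```perl
use strict;
use warnings;

# Invented sample input standing in for the lines read from the file.
my @lines = (
    "<segment start='5' end='9' features='annotation;noun'>",
    "<segment start='5' end='12' features='annotation;verb'>",
    "<other start='1'>",
);

my (%Start, %End);

for my $line (@lines) {
    next unless $line =~ /<segment /;

    # In list context a match returns its captures, so they can be
    # stored in variables instead of being thrown away.
    my ($start)    = $line =~ /start='([0-9]+)'/   or next;
    my ($end)      = $line =~ /end='([0-9]+)'/     or next;
    my ($features) = $line =~ /features='([^']+)'/ or next;

    $features =~ s/annotation;//;

    # %Start is a hash. $Start{$start} is one of its values, used as an
    # array reference; @{ ... } dereferences it. Perl creates the array
    # automatically the first time it is pushed to (autovivification).
    push @{ $Start{$start} }, $features;
    push @{ $End{$end} },     $features;
}

print "start 5: ", join(",", @{ $Start{5} }), "\n";
print "end 9: ",   join(",", @{ $End{9} }),   "\n";
```

With the sample lines above this should print start 5: noun,verb and end 9: noun. So the push is not the same as @Start = ($start): there is no @Start array at all. %Start is a hash keyed by start position, and each value is a reference to an array that collects every feature string seen for that position.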