
Perl's capture group disappears while in scope


I have very simple code that parses a file name:

#!/usr/bin/env perl

use 5.040;
use warnings FATAL => 'all';
use autodie ':default';

my $string = '/home/con/bio.data/blastdb/phytophthora.infestans.KR_2_A2/GCA_012552325.1.protein.faa';

if ($string =~ m/blastdb\/(\w)\w+\.([\w\.]+)/) {
    my $rest = $2; # $1 would be valid here
    $rest =~ s/\./ /g;
    my $name = "$1.$rest"; # $1 disappears here
}

The above code fails with: Use of uninitialized value $1 in concatenation (.) or string

However, if I save $1 into a variable, e.g. $g, the information isn't lost.

if ($string =~ m/blastdb\/(\w)\w+\.([\w\.]+)/) {
    my ($g, $rest) = ($1, $2);
    $rest =~ s/\./ /g;
    my $name = "$g.$rest";
}

So I can fix this.

However, $1 shouldn't just disappear like that. Shouldn't $1 remain valid while in scope? Is this a bug in Perl, or is there some rule in https://perldoc.perl.org/perlretut that I missed?


Solution

  • $rest =~ s/\./ /g; performs a regex match of its own. The capture variables always refer to the last successful match in the current scope, and that pattern (/\./) has no capturing groups, so $1 and $2 are undefined once the substitution succeeds.

    You can save what you need in variables, most simply by writing if (my ($g, $rest) = $string =~ /yadda yadda/). Alternatively, avoid running other regex matches until you're done with the captures from the previous one. In this case, $rest =~ tr/./ / would do the job just as well as $rest =~ s/\./ /g, but without clobbering the capture variables.
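
A minimal sketch of both suggestions, plus a small demonstration of the clobbering itself. The short_name helper and the m{...} delimiters are illustrative choices, not part of the original post. The pattern is otherwise the one from the question.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): builds the short name from a path.
sub short_name {
    my ($path) = @_;

    # List assignment copies the captures into lexicals immediately.
    # A failed match returns an empty list, so the condition is false.
    if (my ($g, $rest) = $path =~ m{blastdb/(\w)\w+\.([\w.]+)}) {
        # tr/// is a transliteration, not a regex match, so it would
        # leave $1 and $2 untouched even if we were still relying on them.
        $rest =~ tr/./ /;
        return "$g.$rest";
    }
    return;    # undef in scalar context when the path does not match
}

my $string = '/home/con/bio.data/blastdb/phytophthora.infestans.KR_2_A2/GCA_012552325.1.protein.faa';
print short_name($string), "\n";

# Demonstration of the original problem: a successful match with no
# capture groups leaves $1 undefined afterwards.
if ('ab' =~ /(a)/) {
    my $copy = 'x.y';
    $copy =~ s/\./ /g;    # succeeds, pattern has no groups
    print defined $1 ? "\$1 is still set\n" : "\$1 is now undef\n";
}
```

For the path in the question, short_name should return "p.infestans KR_2_A2". Copying the captures at match time is the more robust of the two fixes, since it keeps working if someone later adds another regex operation inside the block.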