I am trying to get all occurrences of a group of patterns in an arbitrary string, much like this:
my $STRING = "I have a blue cat. That cat is nice, but also quite old. She is always bored.";
foreach (my @STOPS = $STRING =~ m/(?<FINAL_WORD>\w+)\.\s*(?<FIRST_WORD>\w+)/g ) {
print Dumper \%+, \@STOPS;
}
But the outcome is not what I expected, and I don't fully understand why:
$VAR1 = {
'FINAL_WORD' => 'old',
'FIRST_WORD' => 'She'
};
$VAR2 = [
'cat',
'That',
'old',
'She'
];
$VAR1 = {
'FINAL_WORD' => 'old',
'FIRST_WORD' => 'She'
};
$VAR2 = [
'cat',
'That',
'old',
'She'
];
$VAR1 = {
'FINAL_WORD' => 'old',
'FIRST_WORD' => 'She'
};
$VAR2 = [
'cat',
'That',
'old',
'She'
];
$VAR1 = {
'FINAL_WORD' => 'old',
'FIRST_WORD' => 'She'
};
$VAR2 = [
'cat',
'That',
'old',
'She'
];
If there is no better solution, I could live with what is in @STOPS in the end and omit the loop. But I would prefer to get every pair of matches separately, and I don't see a way.
But why then is the loop executed multiple times anyway?
Thank you in advance, and Regards,
Mazze
You need to use a while loop, not a for loop:
while ($STRING =~ m/(?<FINAL_WORD>\w+)\.\s*(?<FIRST_WORD>\w+)/g ) {
print Dumper \%+;
}
Output:
$VAR1 = {
'FIRST_WORD' => 'That',
'FINAL_WORD' => 'cat'
};
$VAR1 = {
'FIRST_WORD' => 'She',
'FINAL_WORD' => 'old'
};
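The for loop runs the match in list context, so all captures from all matches are gathered at once in @STOPS (four elements, which is why the loop body executes four times), and %+ is set to the last global match. The while loop evaluates the match in scalar context, which lets you iterate through each global match separately.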
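If you want to keep every pair for later use rather than print it right away, one option is to copy %+ into a fresh hash on each iteration. The following is only a sketch: the names @flat and @pairs are made up for illustration, and the first match is included just to contrast list context with scalar context.

```perl
use strict;
use warnings;

my $string = "I have a blue cat. That cat is nice, but also quite old. She is always bored.";

# List context: every capture from every match comes back as one flat list,
# so a foreach over this list runs once per capture, not once per match.
my @flat = $string =~ m/(?<FINAL_WORD>\w+)\.\s*(?<FIRST_WORD>\w+)/g;

# Scalar context: one match per iteration. Copy %+ into an anonymous hash
# before the next match overwrites it.
my @pairs;
while ($string =~ m/(?<FINAL_WORD>\w+)\.\s*(?<FIRST_WORD>\w+)/g) {
    push @pairs, { %+ };
}

# Each element of @pairs now holds the named captures of one match.
printf "%s -> %s\n", $_->{FINAL_WORD}, $_->{FIRST_WORD} for @pairs;
```

With the sample string this gives two entries in @pairs (cat/That and old/She), matching the Dumper output above.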
According to perldoc perlretut:
The modifier /g stands for global matching and allows the matching operator to match within a string as many times as possible. In scalar context, successive invocations against a string will have /g jump from match to match, keeping track of position in the string as it goes along. You can get or set the position with the pos() function.
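To make the position tracking visible, here is a small sketch that records pos() after each successful match. It uses plain numbered captures instead of the named ones, and the array name @ends is made up for illustration.

```perl
use strict;
use warnings;

my $string = "I have a blue cat. That cat is nice, but also quite old. She is always bored.";

my @ends;
while ($string =~ m/(\w+)\.\s*(\w+)/g) {
    # pos() is the offset where the next /g match attempt will start.
    push @ends, pos($string);
    print "matched '$1' / '$2', next search starts at offset ", pos($string), "\n";
}

# Once a /g match fails (and /c is not used), the position is reset.
print "position has been reset\n" unless defined pos($string);
```

The reset after the final failed match is why running the same while loop a second time starts again from the beginning of the string.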