
Raku regex: How to use capturing groups inside lookbehinds


How can I use capturing groups inside lookbehind assertions?

I tried the same approach as in this answer, but it does not seem to work with lookbehinds.

Conceptually, this is what I was trying to do.

say "133" ~~ m/ <?after $0+> (\d) $ /

I know this can be easily achieved without lookbehinds, but ignore that just for now :)

I tried the following options:

Use :var syntax:

say "133" ~~ m/ <?after $look-behind+> (\d):my $look-behind; $ /;
# Variable '$look-behind' is not declared

Use code block syntax defining the variable outside:

my $look-behind;
say "133" ~~ m/ <?after $look-behind+> (\d) {$look-behind=$0} $ /;
# False

The problem seems to be that the lookbehind is evaluated before the code block or the :my declaration runs, so the variable is still empty when the lookbehind is checked.

Is there a way to use capturing groups inside lookbehinds?


Solution

  • A captured value that is referenced before it has actually been captured is uninitialized, so the match fails. The capturing group must therefore appear before any backreference to it.

    Next, add a code block and assign the captured value to a variable that the rest of the pattern can use; otherwise the capture is not visible inside the lookbehind. See this Capturing Raku reference:

    This code block publishes the capture inside the regex, so that it can be assigned to other variables or used for subsequent matches

    You can use something like

    say "133" ~~ m/ (\d) {} :my $c=$0; <?after $c ** 2> $ /;
    

    Here, (\d) matches and captures a digit, the empty code block {} publishes the capture so that :my $c=$0; can copy it into $c, and the <?after $c ** 2> lookbehind then checks that two consecutive copies of $c appear immediately to the left of the current position (the digit just captured and the one before it). Finally, the $ anchor checks that the current position is the end of the string.

    See this online Raku demo.
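
As a minimal sketch of the same pattern in reusable form, the snippet below wraps it in a small helper. The name ends-with-repeated-digit and the second test input "123" are illustrative assumptions and not part of the original answer; the regex itself is the one shown above.

```raku
# Hypothetical helper (name chosen for illustration only): reports whether
# the string ends in a digit that is immediately preceded by the same digit.
sub ends-with-repeated-digit(Str $s --> Bool) {
    # (\d)               capture one digit
    # {}                 empty code block that publishes the capture as $0
    # :my $c = $0;       copy the capture into a regex-scoped variable
    # <?after $c ** 2>   require two copies of $c ending at this position
    # $                  anchor at the end of the string
    so $s ~~ m/ (\d) {} :my $c = $0; <?after $c ** 2> $ /;
}

say ends-with-repeated-digit("133");   # input from the question
say ends-with-repeated-digit("123");   # assumed counter-example: last digit not repeated
```

Because the capture comes before the lookbehind in the pattern, $c already holds the digit when the lookbehind is evaluated, which is the ordering the question's attempts were missing.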