Raku has an interesting and exciting recursive-regex notation: <~~>.
So in the REPL, we can do this:
[0] > 'hellohelloworldworld' ~~ m/ helloworld /;
「helloworld」
[1] > 'hellohelloworldworld' ~~ m/ hello <~~>? world /;
「hellohelloworldworld」
Going directly from the Raku Docs for Recursive Regexes, we can capture/count various levels of nesting:
~$ raku -pe '#acts like cat here' nest_test.txt
not nested

previous blank
nestA{1}
nestB{nestA{1}2}
nestC{nestB{nestA{1}2}3}
~$ raku -ne 'my $cnt = 0; say m:g/ \{ [ <( <-[{}]>* )> | <( <-[{}]>* <~~>*? <-[{}]>* )> ] \} {++$cnt} /, "\t $cnt -levels nested";' nest_test.txt
() 0 -levels nested
() 0 -levels nested
() 0 -levels nested
(「1」) 1 -levels nested
(「nestA{1}2」) 2 -levels nested
(「nestB{nestA{1}2}3」) 3 -levels nested
(Above, change say to put to return only the captured string.)
But I recently ran into an issue while trying to solve a Unix & Linux question: how to limit the recursion? Let's say we want to capture only what is nested below nestB. Is there any way to do this using the <~~> recursive regex syntax?
~$ raku -ne 'my $cnt = 0; say m:g/ nestB \{ [ <( <-[{}]>* )> | <( <-[{}]>* <~~>*? <-[{}]>* )> ] \} {++$cnt} /, "\t $cnt -levels nested";' nest_test.txt
() 0 -levels nested
() 0 -levels nested
() 0 -levels nested
() 0 -levels nested
() 0 -levels nested
() 0 -levels nested
NOTE: Above I've tried to force some sort of 'frugal recursive behavior' by using <~~>*?. The truth is that <~~> (standard recursive notation), <~~>?, <~~>*, and <~~>*? all give identical results (rakudo-moar-2024.09-01).
Using Recursive Regexes in Raku: how to limit recursion-levels?
Increment a dynamic variable inside a <?{ ... }> conditional. For example:
my $*cnt;
say 'a' x 100 ~~ / <?{++$*cnt <= 5}> a <~~>? /; # 「aaaaa」
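The same counter can be wrapped in a small routine so that every match starts from a fresh count. This is only a minimal sketch, not part of the original example: the name match-limited and its $limit parameter are hypothetical, chosen for illustration. Note that the counter increments each time the regex is entered, so it tracks entry attempts rather than true nesting depth; the two coincide for this simple input because the pattern matches at the first position without backtracking.

```raku
# Hypothetical helper: match at most $limit levels of <~~> recursion.
sub match-limited(Str $s, Int $limit) {
    # A fresh dynamic counter per call; the regex sees it via dynamic scope.
    my $*cnt = 0;
    # The <?{ ... }> assertion fails once the counter exceeds $limit,
    # so the optional recursion <~~>? then matches zero times.
    $s ~~ / <?{ ++$*cnt <= $limit }> a <~~>? /;
}

say match-limited('a' x 100, 3).Str;   # aaa
```

Declaring the counter inside the routine avoids carrying a stale count from one match into the next, which would happen if a single $*cnt were reused across several matches.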