regex, raku, ratchet

Does ratcheting affect nesting structures and "frugal quantifiers"?


In Perl 6 one can use the tilde operator for nesting structures. Apparently ratcheting affects how the nesting structure works.

This case doesn't use ratcheting:

$ perl6 -e "say '{hello} aaa }' ~~ / '{' ~ '}' ( .+? ) /"
「{hello}」
 0 => 「hello」

while this does:

$ perl6 -e"say '{hello} aaa }' ~~ / :r '{' ~ '}' ( .+? ) /"
Nil

I can get the result I expect by changing the .+? pattern into the more specific <-[}]> +:

$ perl6 -e"say '{hello} aaa }' ~~ / :r '{' ~ '}' ( <-[}]> + ) /"
「{hello}」
 0 => 「hello」

but I don't know why the "frugal quantifier" doesn't work using ratcheting. Any idea?

(using rakudo 2019.03.1)


Solution

  • The :ratchet regex adverb forbids the engine from backtracking into a quantified subpattern.

    In the ratcheted / :r '{' ~ '}' ( .+? ) / pattern, this means that once .+? has matched (one or more characters, as few as possible), it will not be re-entered to try a longer match when the pattern that follows it fails.

    Here, in your {hello} aaa } example, after { matches, .+? matches h, and then } is tried against e and fails. Since no backtracking is allowed, the match fails at this starting position and the engine moves on to the next one: h is tested against { and fails, and so on until the end of the string.

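    The ratcheted regex with <-[}]> + works because the negated character class matches one or more characters other than }. That is the crucial difference from .+?, which is able to match } and, being frugal, stops after the single character it is obliged to consume. The character class is greedy and cannot consume }, so it takes hello in one go, stops right before the closing }, and the match succeeds without any backtracking.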
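
The three cases from the question can be put side by side in one small script. This is only a sketch that reuses the patterns and the input string from the question. The variable names ($text, $backtracking, $ratcheted, $negated) are made up for illustration, and the outputs in the comments are the ones the question reports for rakudo 2019.03.1.

```raku
my $text = '{hello} aaa }';

# No ratcheting: when '}' fails after .+? has matched "h", the engine
# goes back into .+? and lets it take one more character, repeating
# until the closing '}' finally matches.
my $backtracking = $text ~~ / '{' ~ '}' ( .+? ) /;
say $backtracking ?? ~$backtracking[0] !! 'no match';   # hello

# Ratcheting: .+? matches "h" and may not be re-entered, so '}' is
# tried against "e", fails, and the whole attempt is abandoned.
my $ratcheted = $text ~~ / :r '{' ~ '}' ( .+? ) /;
say $ratcheted ?? ~$ratcheted[0] !! 'no match';         # no match

# Ratcheting with a greedy negated class: it consumes "hello" in one
# go and stops before '}', so no backtracking is needed.
my $negated = $text ~~ / :r '{' ~ '}' ( <-[}]>+ ) /;
say $negated ?? ~$negated[0] !! 'no match';             # hello
```

The third pattern is usually the one to prefer in a ratcheted regex, since it states directly what the content between the braces may contain.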