regex  regex-lookarounds  raku  rakudo

Lookaround regex and character consumption


Based on the documentation for Raku's lookaround assertions, I read the regex / <?[abc]> <alpha> / as saying "starting from the left, match but do not consume one character that is a, b, or c and, once you have found a match, match and consume one alphabetic character."

Thus, this output makes sense:

'abc' ~~ / <?[abc]> <alpha> /     # OUTPUT: «「a」␤ alpha => 「a」»

Even though that regex has two one-character terms, one of them does not capture, so our total capture is only one character long.

But the next expression confuses me:

'abc' ~~ / <?[abc\s]> <alpha> /     # OUTPUT: «「ab」␤ alpha => 「b」»

Now, our total capture is two characters long, and one of those isn't captured by <alpha>. So is the lookaround capturing something after all? Or am I misunderstanding something else about how the lookaround works?


Solution

  • <?[ ]> and <![ ]> do not seem to support some backslashed character classes. \n, \s, \d and \w show similar results.

    <?[abc\s]> behaves the same as <[abc\s]> when \n, \s, \d or \w is added.

    \t, \h, \v, \c[NAME] and \x61 seem to work as normal.
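
To check this on your own Rakudo version, here is a minimal sketch that runs several variants against the same input and prints what each one matched. The helper name matched-text and the choice of cases are mine, not part of the original answer. The last case uses the explicit <?before ...> lookahead form for comparison; whether it sidesteps the behavior described above on your version is an assumption to verify, not a guarantee.

```raku
# Hypothetical helper: return the text a regex matched, or a marker string.
sub matched-text(Regex $rx, Str $input = 'abc') {
    my $m = $input ~~ $rx;
    $m ?? $m.Str !! '(no match)';
}

# Label => regex pairs. The labels are plain single-quoted strings.
my @cases =
    '<?[abc]>'            => / <?[abc]>   <alpha> /,
    '<?[abc\s]>'          => / <?[abc\s]> <alpha> /,
    '<?[abc\d]>'          => / <?[abc\d]> <alpha> /,
    '<?[abc\t]>'          => / <?[abc\t]> <alpha> /,
    '<?before <[abc\s]>>' => / <?before <[abc\s]>> <alpha> /;

# Print each label next to the text that the whole regex matched.
for @cases -> $case {
    say $case.key, ' => ', matched-text($case.value);
}
```

If the behavior described in this answer applies to your version, the \s and \d lines should show a two-character match while the others show one character. A one-character result on every line would suggest the lookahead is zero-width in all cases on your Rakudo.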