The Python Regular Expression HOWTO explains how the syntax of non-capturing and named groups came about:
For these new features the Perl developers couldn’t choose new single-keystroke metacharacters or new special sequences beginning with \ without making Perl’s regular expressions confusingly different from standard REs. If they chose & as a new metacharacter, for example, old expressions would be assuming that & was a regular character and wouldn’t have escaped it by writing \& or [&].
The solution chosen by the Perl developers was to use (?...) as the extension syntax. ? immediately after a parenthesis was a syntax error because the ? would have nothing to repeat, so this didn’t introduce any compatibility problems.
I don't understand why the ? after a parenthesis would need something to repeat. I do understand the overall point: reusing something that used to be a syntax error to extend regex functionality keeps existing regexes from breaking.
regular-expressions.info explains it nicely.
The ... question mark is the quantifier that makes the previous token optional. This quantifier cannot appear after an opening parenthesis, because there is nothing to be made optional at the start of a group. Therefore, there is no ambiguity between the question mark as an operator to make a token optional and the question mark as part of the syntax for non-capturing groups...
There is nothing to be made optional, because the token before the ? is the group-opening metacharacter (, and not something searchable in a string such as a regular character.
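If it helps, here is a small sketch of that behaviour using Python's re module. The helper name compile_error is made up for this example, and the exact error wording may differ between Python versions, so treat the messages as illustrative rather than guaranteed.

```python
import re

def compile_error(pattern):
    """Return the error message from compiling `pattern`, or None if it compiles.

    Hypothetical helper, used only for this illustration.
    """
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None

# A quantifier with no token in front of it has nothing to act on.
bare_question = compile_error("?")

# The same happens right after an opening parenthesis, shown here with *.
star_after_paren = compile_error("(*a)")

# With a token in front of it, ? is an ordinary quantifier.
optional_ok = compile_error("a?")

# Because ( followed by ? could never be a valid quantifier, the engine is
# free to read (?: as extension syntax, here a non-capturing group.
non_capturing = re.compile("(?:ab)+")

print(bare_question)
print(star_after_paren)
print(optional_ok)
print(non_capturing.groups)  # number of capturing groups in the pattern
```

The first two patterns are rejected because the quantifier has no preceding token, while the last one compiles and reports no capturing groups, since (?: only groups without capturing.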
I don't agree with the phrasing in the HOWTO you linked to. Optional – i.e., zero or one times – is not "repeating".
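That said, regex engines generally treat ? as one of the quantifiers alongside * and +, which may be where the HOWTO's wording comes from. A minimal sketch, assuming Python's re module and a made-up sample word list, showing that ? behaves like the bounded quantifier {0,1}:

```python
import re

# ? and {0,1} are two spellings of the same quantifier: zero or one occurrence.
with_question = re.compile(r"colou?r")
with_braces = re.compile(r"colou{0,1}r")

# Sample inputs chosen for this illustration only.
samples = ["color", "colour", "colouur"]

question_results = [bool(with_question.fullmatch(s)) for s in samples]
brace_results = [bool(with_braces.fullmatch(s)) for s in samples]

print(question_results)
print(brace_results)
```

Both patterns accept "color" and "colour" and reject "colouur", so "repeat" here is best read as "apply a quantifier to", not "match more than once".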