re.VERBOSE with embedded space fails to compile


import re
re.compile(r"(?: *)", flags=re.VERBOSE)

This raises re.error: nothing to repeat at position 4. Each of the following compiles fine:

re.compile(r"(?:\ *)", flags=re.VERBOSE)
re.compile(r"(?:[ ]*)", flags=re.VERBOSE)
re.compile(r"(?: *)")

The docs for VERBOSE say "Whitespace within the pattern is ignored, except ... within tokens like *?, (?: or (?P<...>." but it seems not to be honoring the "(?:" part. Is this a library bug, or am I just not getting my head around what the docs mean?

I can reproduce this on either:

  • Python 3.9.13 / macOS 12.6.1 (Monterey)
  • Python 3.9.2 / Debian 11.5 (bullseye)

Solution

  • There are two important details in the documentation:

    "Ignored, except" does not mean that whitespace in those places yields a valid pattern. It only means the whitespace is not stripped there, so the parser takes it into account.

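    Moreover, the wording is "within tokens". As the documentation also says (immediately after the passage you quote): "For example, (? : and * ? are not allowed." Notice that in those examples the whitespace sits inside the token. In your pattern the space comes after the complete token (?:, so it is ignored like any other whitespace, and the * is left with nothing to repeat.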

    If you run re.compile(r"(?:*)", flags=re.VERBOSE) (i.e. whitespace removed), you will get the same error (though now at position 3).
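    The behaviour described above can be checked with a short script. This is only a sketch: try_compile is a hypothetical helper introduced here for illustration, not part of the re module, and the pattern (? :a) is an extra example of whitespace placed inside a token.

```python
import re


def try_compile(pattern, flags=0):
    """Return None if the pattern compiles, otherwise the re.error raised.

    Hypothetical helper for illustration only.
    """
    try:
        re.compile(pattern, flags)
        return None
    except re.error as exc:
        return exc


# Whitespace AFTER the complete token "(?:" is ignored in verbose mode,
# so the parser effectively sees "(?:*)" and "*" has nothing to repeat.
err_verbose = try_compile(r"(?: *)", re.VERBOSE)
print(err_verbose.msg, err_verbose.pos)    # nothing to repeat 4

# The same pattern with the space removed fails the same way, one position earlier.
err_stripped = try_compile(r"(?:*)", re.VERBOSE)
print(err_stripped.msg, err_stripped.pos)  # nothing to repeat 3

# Whitespace INSIDE a token is not ignored, and the broken token is rejected.
err_inside = try_compile(r"(? :a)", re.VERBOSE)
print(err_inside is not None)              # True

# An escaped space or a character class keeps the space as a literal.
ok = [try_compile(p, re.VERBOSE) is None for p in (r"(?:\ *)", r"(?:[ ]*)")]
print(ok)                                  # [True, True]
```

    The position is reported against the original pattern string, which is why the verbose version says 4 and the version without the space says 3.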