regex, python-3.x, lookbehind

RegEx look-behind '(?<=\n|\A)' does not work because of variable length


I use Python (3) and need a regular expression that matches at the beginning of the string or right after a newline.

I must add the re.DOTALL flag though because I need to process multiple lines at once. The example here is just simplified.

What I came up with is this lookbehind:

(?<=\n|\A)start of line

I tested it on regex101.com where it works, but running it in my Python 3.5 console leads to this error traceback:

$ python3
Python 3.5.1+ (default, Mar 30 2016, 22:46:26) 
[GCC 5.3.1 20160330] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> re.search(r'(?<=\n|\A)start of line', 'just any text to test', re.DOTALL)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.5/re.py", line 173, in search
    return _compile(pattern, flags).search(string)
  File "/usr/lib/python3.5/re.py", line 293, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib/python3.5/sre_compile.py", line 540, in compile
    code = _code(p, flags)
  File "/usr/lib/python3.5/sre_compile.py", line 525, in _code
    _compile(code, p.data, flags)
  File "/usr/lib/python3.5/sre_compile.py", line 158, in _compile
    raise error("look-behind requires fixed-width pattern")
sre_constants.error: look-behind requires fixed-width pattern
>>> 

What can I use instead to overcome this limitation?


Solution

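  • The error message makes sense: \n consumes one character, while \A is a zero-width assertion. The two alternatives in the lookbehind therefore have different widths, and Python's re module only accepts lookbehinds of a single fixed width.

    Try this instead: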

    re.search(r'^start of line', 'just any text to test', re.MULTILINE)
    

    DOTALL is only relevant when you use . in your regular expression: it makes . match newlines as well, and it has no effect on anchors such as ^ or \A.
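A minimal sketch of the suggestion above, assuming a made-up three-line sample string (the `text` and `positions` names are illustrative only). It shows that `^` with `re.MULTILINE` matches at the start of the string and right after each newline, and that adding `re.DOTALL` does not change this:

```python
import re

# Hypothetical sample input: the phrase starts lines 1 and 3, but sits mid-line in line 2.
text = "start of line one\nnot a match: start of line\nstart of line two"

# re.MULTILINE lets ^ match at the start of the string and right after every newline.
# re.DOTALL is added only to show that it does not interfere with the anchor.
pattern = re.compile(r"^start of line", re.MULTILINE | re.DOTALL)

positions = [m.start() for m in pattern.finditer(text)]
print(positions)  # [0, 45]

# Without re.MULTILINE, ^ only matches at the very start of the string.
first_only = [m.start() for m in re.finditer(r"^start of line", text)]
print(first_only)  # [0]
```

The mid-line occurrence in the second line is skipped in both cases, which is the behaviour the lookbehind in the question was meant to provide.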

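    regex101 most likely accepted the pattern because it was set to a different flavor than Python's re (its default flavor is PCRE, which allows the top-level alternatives of a lookbehind to have different fixed lengths). The third-party regex package is also more permissive than re from the standard library: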

    >>> import regex
    >>> regex.search(r'(?<=\n|\A)line', 'test\nline')
    <regex.Match object; span=(5, 9), match='line'>
    

    As you can see, regex accepts variable-width lookbehind patterns.
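If the lookbehind form has to stay (for example, inside a larger pattern) and only the standard library is available, two rewrites keep every lookbehind at a fixed width. This is a sketch under the assumption that "start of line" means "at the start of the string or directly after \n"; the sample `text` and the variable names are illustrative:

```python
import re

# Hypothetical sample input: the phrase starts lines 1 and 3, but sits mid-line in line 2.
text = "start of line\nxx start of line\nstart of line"

# Option 1: move \A out of the lookbehind, so the lookbehind only contains \n (width 1).
alt_anchor = re.compile(r"(?:\A|(?<=\n))start of line")

# Option 2: negative lookbehind meaning "not preceded by a non-newline character".
# That condition holds at the start of the string and right after a newline.
neg_lookbehind = re.compile(r"(?<![^\n])start of line")

starts_1 = [m.start() for m in alt_anchor.finditer(text)]
starts_2 = [m.start() for m in neg_lookbehind.finditer(text)]
print(starts_1)  # [0, 31]
print(starts_2)  # [0, 31]
```

Both patterns behave the same with or without re.DOTALL, since neither contains a dot.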