How can I match r'\a' in Python using a lookbehind assertion?
Actually, I need to match C++ strings like "a \" b"
and
"str begin \
end"
I tried:
>>> res = re.compile('(?<=\)a')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/re.py", line 190, in compile
return _compile(pattern, flags)
File "/usr/lib/python2.7/re.py", line 244, in _compile
raise error, v # invalid expression
>>> res = re.compile('(?<=\\)a')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/re.py", line 190, in compile
return _compile(pattern, flags)
File "/usr/lib/python2.7/re.py", line 244, in _compile
raise error, v # invalid expression
sre_constants.error: unbalanced parenthesis
>>> res = re.compile('(?<=\\\)a')
>>> ms = res.match(r'\a')
>>> ms is None
True
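A note on what seems to be going on in the attempts above: in a normal (non-raw) string literal, both '(?<=\)a' and '(?<=\\)a' reach the regex engine as (?<=\)a, where \) is an escaped parenthesis, so the group is never closed. The third attempt does compile to "an a preceded by a backslash", but match() only tries position 0, which is the backslash itself. A minimal sketch, assuming a raw-string pattern and search() are acceptable here (the variable names are illustrative only):

```python
import re

# Raw string: the regex engine receives (?<=\\)a unchanged,
# i.e. "an 'a' that is preceded by one literal backslash".
lookbehind = re.compile(r'(?<=\\)a')

text = r'\a'  # two characters: a backslash followed by 'a'

# match() is anchored at index 0, where the character is the backslash,
# so the pattern cannot match there.
anchored = lookbehind.match(text)

# search() scans forward and finds the 'a' at index 1.
found = lookbehind.search(text)

print(anchored)       # None
print(found.start())  # 1
```

The lookbehind itself consumes nothing, so found.group() is just 'a'.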
Real example:
When I parse "my s\"tr"; 5; with
ms = res.match(r'"my s\"tr"; 5;')
the expected output is: "my s\"tr"
Answer
In the end, stribizhev provided the solution. I believe my initial regex is less computationally expensive; its only problem was that the pattern has to be declared as a raw string:
>>> res = re.compile(r'"([^\n"]|(?<=\\)["\n])*"', re.UNICODE)
>>> ms = res.match(r'"my s\"tr"; 5;')
>>> print ms.group()
"my s\"tr"
EDIT: The final regex is adapted from the regex provided at Word Aligned.
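To check the lookbehind regex against the second case from the question (a string continued onto the next line with a trailing backslash), something like the following sketch could be used. The sample inputs and variable names are made up for illustration:

```python
import re

# Same pattern as above: any character except a newline or a quote,
# or a quote/newline that is directly preceded by a backslash.
string_literal = re.compile(r'"([^\n"]|(?<=\\)["\n])*"')

# A C++ literal continued with backslash + newline.
continued = '"str begin \\\nend"; 5;'
m = string_literal.match(continued)
matched_continued = m.group() if m else None

# A bare newline (no backslash before it) is not allowed inside the literal,
# so the pattern cannot reach a closing quote and match() returns None.
broken = '"str begin \nend"'
unterminated = string_literal.match(broken)

print(repr(matched_continued))
print(unterminated)
```

The first call should return the whole literal including the line continuation; the second should return None.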
I think you are looking for this regex:
(?s)"(?:[^"\\]|\\.)*"
See demo on regex101.
Sample Python code (tested on TutorialsPoint):
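import re
# Raw string so the regex engine receives the backslashes unchanged
# (the ur'' prefix exists only in Python 2).
p = re.compile(r'(?s)"(?:[^"\\]|\\.)*"')
ms = p.match('"my s\\"tr"; 5;')
print(ms.group(0))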
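One input where the two regexes in this thread appear to differ is a literal that ends in an escaped backslash. The consuming form \\. eats the backslash together with the character it escapes, while the lookbehind form only checks the single preceding character. A small comparison sketch (the names lookbehind_re, consuming_re and the sample input are illustrative, not from the original posts):

```python
import re

# Lookbehind variant from the self-answer.
lookbehind_re = re.compile(r'"([^\n"]|(?<=\\)["\n])*"')
# Consuming variant: a backslash always takes the next character with it.
consuming_re = re.compile(r'(?s)"(?:[^"\\]|\\.)*"')

# C++ source text: "a\\" "b"  (first literal is 'a' plus one backslash)
source = r'"a\\" "b"'

lookbehind_match = lookbehind_re.match(source).group()
consuming_match = consuming_re.match(source).group()

# The lookbehind variant sees a backslash before the closing quote,
# treats that quote as escaped, and runs on to the next quote.
print(lookbehind_match)  # "a\\" "
# The consuming variant pairs the two backslashes and stops at the real end.
print(consuming_match)   # "a\\"
```

If escaped backslashes can occur in the input, the consuming pattern is probably the safer choice.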