I'm seeing strange behavior from re.search on a binary file. Here is a screenshot of my Python session:
As you can see, I have two problems:
Any ideas?
\x5b is the ASCII [ character, the left square bracket. That's a regex metacharacter that starts a [...] character class specification, so it needs to be escaped if you want to match a literal [ character:
>>> import re
>>> re.search('[', '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mjpieters/Development/venvs/stackoverflow-2.7/lib/python2.7/re.py", line 146, in search
    return _compile(pattern, flags).search(string)
  File "/Users/mjpieters/Development/venvs/stackoverflow-2.7/lib/python2.7/re.py", line 251, in _compile
    raise error, v # invalid expression
sre_constants.error: unexpected end of regular expression
>>> re.search('\[', '')
The same applies to \x5e, which is the ^ character. In a regex, ^ is an anchor that matches only at the start of the string, not the literal character ^. Since your pattern requires data before the ^ anchor, the regex can never match: the start of the string cannot come after other characters. Escape it as \^ to match the literal character.
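To make the difference concrete, here is a minimal sketch, assuming Python 3 and a bytes pattern (the session above is Python 2.7). The `data` payload is made up for illustration and is not taken from your file.

```python
import re

# Hypothetical binary payload, invented for this example.
data = b"\x00\x01\x5e\x5b\x02"

# Unescaped 0x5b is "[", which opens a character class that never closes,
# so compiling the pattern fails.
try:
    re.search(b"\x5b", data)
    bracket_error = None
except re.error as exc:
    bracket_error = exc

# Unescaped 0x5e is "^", the start-of-string anchor. Requiring a byte
# before the anchor can never succeed, so the result is None.
unescaped = re.search(b"\x01\x5e", data)

# Escaped metacharacters match the literal bytes 0x5e and 0x5b.
escaped = re.search(rb"\x01\^\[", data)

print(bracket_error)
print(unescaped)
print(escaped.start(), escaped.group())
```

The raw bytes literal (`rb"..."`) keeps the backslashes intact so that the regex engine, not the Python parser, interprets `\^` and `\[`.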
If you are only searching for literal text, don't use a regex. You could just use str.find() or str.index() to get the index of the matched text.
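A short sketch of the literal search, again assuming Python 3, where binary data is a bytes object and the equivalent methods are bytes.find() and bytes.index(). The `data` and `needle` values are hypothetical.

```python
# Hypothetical binary payload and search term, invented for this example.
data = b"\x00\x01\x5e\x5b\x02"
needle = b"\x5e\x5b"

# find() returns the offset of the first occurrence, or -1 when absent.
found_at = data.find(needle)
missing = data.find(b"\xff")

# index() returns the same offset but raises ValueError when absent.
try:
    data.index(b"\xff")
    index_raised = False
except ValueError:
    index_raised = True

print(found_at, missing, index_raised)
```

No escaping is needed here, because neither method treats any byte as special.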
If you are using this as part of a larger expression and generate the expression from data, use re.escape() to ensure all metacharacters are properly escaped first.
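As a rough illustration, assuming Python 3 (where re.escape() accepts bytes as well as str), a literal fragment taken from data can be escaped and then combined with real regex syntax. The `data`, `literal`, and trailing `\x02+` part are made up for the example.

```python
import re

# Hypothetical payload and literal fragment, invented for this example.
data = b"\x00\x01\x5e\x5b\x02\x02\x03"
literal = b"\x5e\x5b"  # the bytes "^[", both regex metacharacters

# re.escape() backslash-escapes the metacharacters in the literal part,
# so it can be safely combined with real regex syntax.
pattern = re.compile(re.escape(literal) + rb"\x02+")

match = pattern.search(data)
print(match.start(), match.group())
```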