So, I was coding a Discord bot that uses on_message() to check whether a message contains a banned word. It raised an error during execution, so I created a separate file to reproduce it. Here's code that does the same kind of check as the bot:
from re import search
from re import IGNORECASE
banned_words = (r"N[a-zA-Z0-9]gga")
for banned_word in banned_words:
    if search(banned_word, input("> "), IGNORECASE):
        print("N-word detected")
Here's the test run:
root@kali:~# python3 test.py
> nigga
N-word detected
> nigga
Traceback (most recent call last):
  File "/root/test.py", line 5, in <module>
    if search(banned_word, input("> "), IGNORECASE):
  File "/usr/lib/python3.9/re.py", line 201, in search
    return _compile(pattern, flags).search(string)
  File "/usr/lib/python3.9/re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/usr/lib/python3.9/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/usr/lib/python3.9/sre_parse.py", line 948, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "/usr/lib/python3.9/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/usr/lib/python3.9/sre_parse.py", line 549, in _parse
    raise source.error("unterminated character set",
re.error: unterminated character set at position 0
What could be going wrong here? Also, it shouldn't loop more than once, right? I don't understand why it called input() twice.
Parentheses without a comma only group an expression, so banned_words is the string N[a-zA-Z0-9]gga, not a tuple, and for banned_word in banned_words: iterates over its characters. Each iteration also calls input() again, which is why you were prompted twice.
The first value of banned_word is the string N. The search for this succeeds.
The second value of banned_word is the string [. This is not a valid regexp by itself; it's the start of a [...] character set, so you get an error.
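A quick way to check this diagnosis is to inspect the value directly. The sketch below uses a neutral placeholder pattern (c[a-z0-9]t) instead of the one from the question, and the names pieces and compile_error are made up for illustration:

```python
import re

# Placeholder pattern standing in for the one in the question (assumption).
banned_words = (r"c[a-z0-9]t")      # parentheses only group: this is a str

print(type(banned_words).__name__)  # str
pieces = list(banned_words)         # iterating a str yields single characters
print(pieces[:2])                   # ['c', '[']

# The second "pattern" is a lone "[", which cannot be compiled as a regexp.
try:
    re.compile(pieces[1])
    compile_error = None
except re.error as exc:
    compile_error = str(exc)
print(compile_error)
```

The last line should print the same kind of "unterminated character set" message as the traceback in the question.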
If banned_words is supposed to be a tuple, you need a trailing comma:
banned_words = (r"N[a-zA-Z0-9]gga",)
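With a real tuple, the loop visits whole patterns rather than characters. Here is a minimal sketch of that approach, assuming the message text is already available as a string (as it would be in a bot handler) so it is read once instead of inside the loop. The helper name contains_banned and the placeholder pattern are not from the question:

```python
import re

# Placeholder pattern (assumption); the trailing comma makes this a one-item tuple.
banned_words = (r"c[a-z0-9]t",)

def contains_banned(message, patterns=banned_words):
    """Return True if any pattern matches anywhere in message, ignoring case."""
    return any(re.search(p, message, re.IGNORECASE) for p in patterns)

print(contains_banned("My CAT is asleep"))  # True
print(contains_banned("hello world"))       # False
```

Passing the message in as an argument also avoids the repeated prompt, since nothing inside the loop asks for input.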
But if you want to test multiple regular expressions, you can simply put them all in a single regexp with | alternation:
banned_words = r"N[a-z0-9]gga|F[a-z0-9]+ck"
and then just do a single search rather than looping.
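As a rough sketch of the single-search version: the alternation can be compiled once and reused for every message. The patterns here are neutral placeholders, and BANNED and first_banned are hypothetical names chosen for this example:

```python
import re

# Placeholder alternation (assumption), compiled once with IGNORECASE.
BANNED = re.compile(r"c[a-z0-9]t|d[a-z0-9]+g", re.IGNORECASE)

def first_banned(message):
    """Return the matched text, or None if nothing matches."""
    match = BANNED.search(message)
    return match.group(0) if match else None

print(first_banned("a big DOOG barked"))  # DOOG
print(first_banned("nothing here"))       # None
```

Because | has the lowest precedence in a regexp, each side of it is treated as a complete alternative, so no extra grouping is needed here.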