I am trying to check whether a word exists in a string. The problem is that the search word contains the character ':', and the search fails even though I escape the word.
In the example below, the search for 'decision :' reports that it does not exist, although it does appear in the sentence.
Note that the match must be exact: if I search for the word 'for', the result must be "not exist" when the sentence only contains a longer word such as 'formatted'.
```python
import re

texte = " hello \n a formated test text \n decision : repair \n toto \n titi"
word_list = ['decision :', 'for']

def verif_exist(word_list, paragraph):
    exist = False
    for word in word_list:
        exp = re.escape(word)
        print(exp)
        if re.search(r"\b%s\b" % exp, paragraph, re.IGNORECASE):
            print("From exist, word detected: " + word)
            exist = True
        if exist == True:
            break
    return exist

if verif_exist(word_list, texte):
    print("exist")
else:
    print("not exist")
```
The documentation states: "\b matches the empty string, but only at the beginning or end of a word. A word is defined as a sequence of word characters." There is no word boundary between ':' and the following space because neither of them is a word character, so the trailing \b in your pattern can never match there.
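A quick way to see this behaviour is to test the boundary in both positions. This is only a small sketch; the variable names `after_colon` and `after_word` are made up for the illustration.

```python
import re

text = "decision : repair"

# ':' and ' ' are both non-word characters, so there is no boundary between them.
after_colon = re.search(r"decision :\b", text)

# 'n' is a word character and ' ' is not, so a boundary exists here.
after_word = re.search(r"decision\b", text)

print(after_colon)             # None
print(after_word is not None)  # True
```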
Instead, you can accept either a word boundary or a whitespace character after the word:
```python
import re

texte = " hello \n a formated test text \n decision : repair \n toto \n titi"
word_list = ['decision :', 'for']

def verif_exist(word_list, paragraph):
    for word in word_list:
        exp = re.escape(word)
        print(exp)
        # Accept a word boundary OR a whitespace character after the word.
        if re.search(fr"\b{exp}(\b|\s)", paragraph, re.IGNORECASE):
            print("From exist, word detected: " + word)
            return True
    return False

if verif_exist(word_list, texte):
    print("exist")
else:
    print("not exist")
```
That's still not perfect. Consider what happens if your text is just 'decision :'. There is neither a word boundary nor a whitespace character after the colon, so we have to add a check for the end of the text, giving us:
```python
if re.search(fr"\b{exp}(\b|\s|$)", paragraph, re.IGNORECASE):
```
You may need to handle the word boundary at the beginning of the regular expression in a similar way, for example if a search word starts with a non-word character.
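One possible way to handle both ends at once is to replace `\b` with negative lookarounds, which only require that the match is not directly touching a word character. The sketch below is an alternative to the pattern above, not the same technique, and the helper name `contains_exact` is hypothetical.

```python
import re

def contains_exact(word, paragraph):
    """Return True if `word` occurs without a word character on either side."""
    # (?<!\w): the match must not be preceded by a word character.
    # (?!\w):  the match must not be followed by a word character.
    # Both also succeed at the very start or end of the text.
    pattern = rf"(?<!\w){re.escape(word)}(?!\w)"
    return re.search(pattern, paragraph, re.IGNORECASE) is not None

texte = " hello \n a formated test text \n decision : repair \n toto \n titi"

print(contains_exact("decision :", texte))         # True
print(contains_exact("for", texte))                # False
print(contains_exact("decision :", "decision :"))  # True
```

Note that this variant behaves differently in one case: it rejects 'decision :repair', because the colon is directly followed by a word character, whereas the `(\b|\s|$)` pattern accepts it. Which behaviour is right depends on what you count as an exact match.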