I am trying to replace every instance of Um with #Um. For example, "Um, i have an Umbrella" would become "#Um, i have an Umbrella", where Umbrella stays as is because it's not just Um.
Below is my code.
while re.search(r'\bUm\b', trans):
    trans = re.sub(r'\bUm\b', r'#Um', trans)
And my code does not stop running. It loops indefinitely. Is there any other way to approach this problem?
After you do the replacement, the string still matches the regexp you're testing, because there's a word boundary between # and U in #Um. A word boundary is any place where there's a word character on one side and a non-word character (or the start or end of the string) on the other.
So after you do the first replacement the string is
#Um, i have an Umbrella
The next iteration changes it to
##Um, i have an Umbrella
and it keeps adding # over and over.
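A bounded version of the loop makes the growth visible. This is only a sketch: the cap of three passes and the history list are assumptions added so the demonstration terminates and can be inspected.

```python
import re

trans = "Um, i have an Umbrella"
history = []  # hypothetical helper list, only here to record each pass

# Same search/sub pair as in the question, capped at 3 passes so it stops.
for _ in range(3):
    if not re.search(r'\bUm\b', trans):
        break
    trans = re.sub(r'\bUm\b', r'#Um', trans)
    history.append(trans)

for step in history:
    print(step)
# #Um, i have an Umbrella
# ##Um, i have an Umbrella
# ###Um, i have an Umbrella
```

The search never fails, so without the cap the loop would never exit.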
Since re.sub() replaces all matches in the string (unless you limit it with the optional count argument), there's no need to do the same replacement in a loop. Get rid of the while statement and just call re.sub() once.
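A minimal sketch of the single-call version, using the example string from the question. The second half is an optional variant that is not part of the question: it assumes you might run the replacement on text that already contains #Um, and uses a negative lookbehind so those are left alone. The names result, tag_um, once and twice are illustrative.

```python
import re

trans = "Um, i have an Umbrella"

# One call is enough: re.sub() replaces every non-overlapping match.
result = re.sub(r'\bUm\b', '#Um', trans)
print(result)  # #Um, i have an Umbrella

# Optional variant (an assumption, not from the question): skip any Um
# that is already preceded by '#', so running it again changes nothing.
tag_um = re.compile(r'(?<!#)\bUm\b')
once = tag_um.sub('#Um', trans)
twice = tag_um.sub('#Um', once)
print(once == twice)  # True
```

Umbrella is untouched in both cases because there is no word boundary between m and b.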