import re
input_str = "esta a mil o a 25 mil 500 millas de la ciudad de milan" #Example 1
input_str = "mil o mas para pasar! y tu tienes menos de mil" #Example 2
input_str = "hace casi mil o 2mil anos aparecio un fenomeno legendario tan solo a millas de aqui" #Example 3
How do I make it do the replacement only if the word "mil" is NOT preceded by a number (\d)? This is what I have so far:
input_str = re.sub(r"\d[\s|]*(?:mil)", " 1 mil" , input_str)
The output I need for these 3 examples:
"esta a 1 mil o a 25 mil 500 millas de la ciudad de milan" # for Example 1
"1 mil o mas para pasar! y tu tienes menos de 1 mil" # for Example 2
"hace casi 1 mil o 2mil anos aparecio un fenomeno legendario tan solo a millas de aqui" # for Example 3
If I understand correctly, the regex would be:
Meaning that you want to match mil only if what precedes it is neither a digit, \d, nor a digit followed by a space, \d\s. I'm using two negative lookbehinds since Python's re module doesn't support variable-length lookbehinds.
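
A minimal sketch of a pattern that fits this description, checked against the three examples from the question. The \b word boundaries are an assumption added here so that "millas" and "milan" are left alone, and the helper name add_implicit_one is hypothetical:

```python
import re

# Two fixed-width negative lookbehinds:
#   (?<!\d)   rejects a digit directly before "mil"        (e.g. "2mil")
#   (?<!\d\s) rejects a digit plus one whitespace character (e.g. "25 mil")
# The \b anchors are an assumption: they keep "millas" and "milan" untouched.
MIL_PATTERN = re.compile(r"(?<!\d)(?<!\d\s)\bmil\b")


def add_implicit_one(text):
    """Hypothetical helper: prefix a bare "mil" with "1 "."""
    return MIL_PATTERN.sub("1 mil", text)


examples = [
    "esta a mil o a 25 mil 500 millas de la ciudad de milan",
    "mil o mas para pasar! y tu tienes menos de mil",
    "hace casi mil o 2mil anos aparecio un fenomeno legendario tan solo a millas de aqui",
]

for input_str in examples:
    print(add_implicit_one(input_str))
```

The replacement string is "1 mil" without a leading space, because the space before "mil" is not part of the match and is therefore kept. Since each lookbehind has a fixed width, this only covers a single whitespace character between the number and "mil"; a number followed by two or more spaces would still be treated as a bare "mil".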