Tags: python, python-3.x, regex, replace, regex-group

Set a regex search pattern with one constraint on what must not precede a substring and another on what must not follow it


import re, datetime

input_text = "Por las mañanas de verano voy a la playa, y en la manana del 22-12-22 16:22 pm o quizas mañana en la mañana hay que estar alli y no 2022-12-22 a la manana"

today = datetime.date.today()
tomorrow = str(today + datetime.timedelta(days = 1))

input_text = re.sub(r"\b(?:las|la)\b[\s|]*(?:mañana|manana)\bs\s*\b", tomorrow, input_text)

print(repr(input_text))  # --> output

Why does the restriction I set fail?

The goal is that none of the options in (?:las|la) may appear in front of the pattern (?:mañana|manana), and that the pattern may not be followed by a letter 's' and then optional whitespace (s\s*).

This is the expected output, with the replacement made only where it is appropriate:

"Por las mañanas de verano voy a la playa, y en la manana del 22-12-22 16:22 pm o quizas 22-12-23 en la mañana hay que estar alli y no 2022-12-22 a la manana"

Solution

  • You can use

    re.sub(r"\b(las?\s+)?ma[ñn]ana(?!s\b)", lambda x: x.group() if x.group(1) else tomorrow, input_text)
    

    The regex matches

    • \b - a word boundary
    • (las?\s+)? - an optional capturing group (Group 1): la or las followed by one or more whitespace characters
    • ma[ñn]ana - mañana or manana
    • (?!s\b) - a negative lookahead that fails the match if mañana/manana is immediately followed by an s that ends the word (as in mañanas).

    If Group 1 participates in the match, the lambda returns the matched text unchanged, so no replacement occurs; if it does not, the match is replaced with tomorrow.
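    As a rough end-to-end check, the sketch below runs the pattern above on the question's input. Two things here are assumptions made for illustration and are not in the original post: the helper name replace_bare_manana, and the fixed date string "2022-12-23", which stands in for str(today + timedelta(days=1)) so the result is reproducible (str() of a datetime.date yields the ISO form YYYY-MM-DD). It also shows that the pattern from the question finds nothing: \bs asks for a word boundary between the final a and the s, which are both word characters, so that position can never match.

```python
import re

text = ("Por las mañanas de verano voy a la playa, y en la manana del "
        "22-12-22 16:22 pm o quizas mañana en la mañana hay que estar alli "
        "y no 2022-12-22 a la manana")

# Assumption: a fixed stand-in for str(today + timedelta(days=1)),
# used only to keep the output reproducible.
tomorrow = "2022-12-23"

# Pattern from the question. "\bs" requires a word boundary between two
# word characters ("a" and "s"), so the pattern cannot match anywhere.
original = r"\b(?:las|la)\b[\s|]*(?:mañana|manana)\bs\s*\b"
print(re.findall(original, text))  # []

# Pattern from the answer: Group 1 captures an optional leading "la "/"las ".
pattern = re.compile(r"\b(las?\s+)?ma[ñn]ana(?!s\b)")

def replace_bare_manana(s, replacement):
    """Hypothetical helper: replace mañana/manana only when no la/las precedes it."""
    # Keep the match as-is when Group 1 took part; otherwise substitute.
    return pattern.sub(lambda m: m.group() if m.group(1) else replacement, s)

result = replace_bare_manana(text, tomorrow)
print(result)
```

    Only the mañana after "quizas" is replaced; "las mañanas", "la manana" and "la mañana" are left untouched. Note that the pattern has no word boundary after ma[ñn]ana, so a longer word that merely starts with mañana would also match; appending \b after the lookahead is one way to tighten it if that matters for your data.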