python regex substitution punctuation

Fetch the substituted text for each regex match in Python


Suppose we have a string: "This is an example.It does not contain space after one sentence." And a matching pattern: "(\.|,|:|;|!|\)|\])(\s*)([a-zA-Z]*)" This pattern matches a punctuation mark, followed by any amount of whitespace (including none), followed by a run of letters. The goal is to fix the cases where there is no space, or more than one space, after the punctuation. Each match is substituted with \1 \3, which leaves exactly one space after the punctuation mark. The output will be: This is an example. It does not contain space after one sentence. (a single space has been inserted)
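For reference, the substitution described above can be sketched as follows. This is a minimal illustration using the string and pattern from the question; the variable names are only placeholders.

```python
import re

text = "This is an example.It does not contain space after one sentence."
pattern = r"(\.|,|:|;|!|\)|\])(\s*)([a-zA-Z]*)"

# \1 is the punctuation mark and \3 the letters that follow it.
# \2 (the original whitespace) is dropped and replaced by one space.
fixed = re.sub(pattern, r"\1 \3", text)
print(repr(fixed))
# 'This is an example. It does not contain space after one sentence. '
```

Note the trailing space in the result: the final period also matches (with an empty third group), so it is replaced by a period plus a space.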

My question is: for the match ".It" we know the matched string and its index position. But how can we fetch what exactly was substituted at that position? I want to fetch ". It" (dot, space, It).

Note: Please also consider the case of multiple matches in a single line.

Edit:

Input: This is text.Another text.Next case

Output: [". Another",". Next"]


Solution

  • Use the regex below:

    .*?(\.)\s*(\w*)\s
    

    Code

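    import re
    a = "This is text.Another text.Next case"
    # Raw string, so "\." and "\s" reach the regex engine unchanged.
    # findall returns (punctuation, following word) tuples; join each with one space.
    print([i + " " + j for (i, j) in re.findall(r".*?(\.)\s*(\w*)\s", a)])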
    

    Output

    ['. Another', '. Next']
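The regex above only handles periods and needs whitespace after the following word. As an alternative sketch (not part of the original answer), the pattern from the question can be kept as is, with Match.expand computing what the substitution template would produce for each match. The helper name substituted_texts and the choice to skip matches that the substitution leaves unchanged are assumptions made for illustration.

```python
import re

# Pattern and template taken from the question.
PATTERN = re.compile(r"(\.|,|:|;|!|\)|\])(\s*)([a-zA-Z]*)")
REPLACEMENT = r"\1 \3"


def substituted_texts(text):
    """Hypothetical helper: return (start, matched, substituted) per changed match."""
    results = []
    for m in PATTERN.finditer(text):
        new = m.expand(REPLACEMENT)  # what re.sub would write for this match
        if new != m.group(0):  # skip matches that already had exactly one space
            results.append((m.start(), m.group(0), new))
    return results


text = "This is text.Another text.Next case"
print([new for _, _, new in substituted_texts(text)])
# ['. Another', '. Next']
```

Because finditer yields every non-overlapping match, multiple matches in a single line are covered, and m.start() gives the index of each one in the original string.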