Tags: python, regex, python-3.x, io

How to search a text file and enclose matching tokens in brackets?


Given a text file, how can I replace every token that begins with % so that the token is enclosed in [] instead? For instance, given the following text file:

Hi how are you? 
I %am %fine.
Thanks %and %you

How can I enclose every token prefixed with % in [], dropping the %, to get:

Hi how are you? 
I [am] [fine].
Thanks [and] [you]

I tried to filter the tokens first and then replace them, but maybe there is a more Pythonic way:

with open('../file') as f:
    s = str(f.readlines())
    a_list = re.sub(r'(?<=\W)[$]\S*', s.replace('.',''))
    a_list= set(a_list)
    print(list(a_list))

Solution

  • You may use

    re.sub(r'\B%(\w+)', r'[\1]', s)
    


    Details

    • \B - a non-word boundary: since the next char, %, is a non-word char, there must be the start of the string or a non-word char immediately to the left of the current location
    • % - a % char
    • (\w+) - Group 1: any 1 or more word chars (letters, digits or _). Replace with (\S+) to match 1 or more non-whitespace chars if necessary, but note \S also matches punctuation.
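
    The effect of \B and the difference between (\w+) and (\S+) can be seen in a small sketch. The sample string here is made up for illustration and is not from the question; it adds two cases, 100%sure and a%b, where the % follows a word char and is therefore left alone:

```python
import re

# Hypothetical sample: two %-prefixed tokens plus two cases where % follows a word char.
s = "I %am %fine. 100%sure a%b"

# \w+ stops at punctuation, so the trailing "." stays outside the brackets.
word_version = re.sub(r"\B%(\w+)", r"[\1]", s)

# \S+ runs up to the next whitespace, so the "." is pulled inside the brackets.
nonspace_version = re.sub(r"\B%(\S+)", r"[\1]", s)

print(word_version)      # I [am] [fine]. 100%sure a%b
print(nonspace_version)  # I [am] [fine.] 100%sure a%b
```

    In both versions 100%sure and a%b are untouched, because the position between a word char and % is a word boundary and \B refuses to match there.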

    Python demo:

    import re
    
    s = "Hi how are you? \nI %am %fine.\nThanks %and %you"
    result = re.sub(r"\B%(\w+)", r"[\1]", s)
    print(result)
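
    Since the question starts from a file rather than a string, here is a minimal sketch of applying the same substitution to a file's contents. The helper name enclose_percent_tokens, the temporary file, and the choice to write the result back to the same file are assumptions made for illustration, not part of the original answer:

```python
import re
import tempfile
from pathlib import Path


def enclose_percent_tokens(text):
    """Wrap every %-prefixed token in square brackets (hypothetical helper)."""
    return re.sub(r"\B%(\w+)", r"[\1]", text)


# A temporary file keeps the sketch self-contained; in practice, point `path`
# at your own file instead.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.txt"
    path.write_text("Hi how are you? \nI %am %fine.\nThanks %and %you\n", encoding="utf-8")

    # Read the whole file as one string. str(f.readlines()) would instead give
    # the repr of a list, with brackets, quotes, and escaped newlines in it.
    original = path.read_text(encoding="utf-8")
    updated = enclose_percent_tokens(original)

    # Overwrite the file with the substituted text (an assumption; write to a
    # different path if the original must be kept).
    path.write_text(updated, encoding="utf-8")

    print(path.read_text(encoding="utf-8"))
```

    Reading the file in one go is fine for small inputs; for very large files the same function could be applied line by line, since the pattern never spans a newline.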