python regex string whitespace brackets

Is there an easy way to remove unnecessary whitespace inside brackets in the middle of a string in Python?


I have strings of the form:

s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."

and I would like to get a cleaned string in the form of:

s = "Wow that is really nice, (2.1) shows that according to the drawings in (1.1) and a) there are errors."

I tried to fix it with a regex:

import re

regex = r" (?=[^(]*\))"
s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are some errors."
re.sub(regex, "", s)

But I get faulty results like this, where the spaces around "and" are also removed: Wow that is really nice, (2.1) shows that according to the drawings in (1.1)anda) there are some errors.

Does anyone know how to deal with this problem when you don't always have the same number of opening and closing brackets?


Solution

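  • You can match all the innermost parentheses with a simple regex, and then perform a substitution on each match to remove the spaces inside it. Because the pattern requires both an opening and a closing parenthesis, an unmatched one such as "a)" is left alone.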

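    import re

    s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."
    # Match each innermost "(...)" pair: no parentheses allowed between the two.
    regex = r"\([^()]*\)"
    # Replace every match with a copy of itself that has its spaces removed.
    res = re.sub(regex, lambda m: m.group(0).replace(" ", ""), s)

    print(res)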
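For context, the lookahead in the question, ` (?=[^(]*\))`, only checks that a `)` follows the space with no `(` in between. It never checks that a `(` was opened first, so the spaces around "and" (which are followed by the unmatched `a)`) get removed as well. The same idea as above can be wrapped in a small helper. This is only a sketch: the names `INNER_PARENS` and `strip_inside_parens` are made up for illustration, and it assumes you want every kind of whitespace (tabs and newlines too, via `\s`) removed, not just spaces.

```python
import re

# Innermost parenthesised group: "(" and ")" with no parentheses in between.
# (Hypothetical name, chosen for this example.)
INNER_PARENS = re.compile(r"\([^()]*\)")


def strip_inside_parens(text: str) -> str:
    """Remove all whitespace inside each innermost (...) group of text."""
    # For every match, delete runs of whitespace within the matched text only.
    return INNER_PARENS.sub(lambda m: re.sub(r"\s+", "", m.group(0)), text)


s = "Wow that is really nice, ( 2.1 ) shows that according to the drawings in ( 1. 1) and a) there are errors."
print(strip_inside_parens(s))
# Wow that is really nice, (2.1) shows that according to the drawings in (1.1) and a) there are errors.
```

Note that this removes all whitespace inside the parentheses, so "(see figure 2)" would become "(seefigure2)". If the parentheses can contain ordinary words, stripping only the whitespace next to the parentheses is probably closer to what you want.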