Tags: python, regex, python-re

Match characters between square brackets but only if text inside brackets follows pattern


I want to match text inside square brackets, but ONLY if it contains a hashtag followed by two digits,

e.g. [#18] or [hello #25 bye],

but NOT [25] (no hashtag).

I ultimately want to remove these matched strings (including the brackets and ALL the text inside them).

For example

12345 one two [#13 west] words [2025/02/25] #15 [#88]turtles [smth #25 else].

I would like it changed to:

12345 one two  words [2025/02/25] #15 turtles .

There will never be brackets nested inside the brackets to be filtered out; they contain only hashtags, letters (A-Z, a-z), digits (0-9) and spaces. I got it working for the brackets and hashtag+digit+digit, but not for the optional surrounding words.


Solution

  • You can replace \[[^]]*#\d{2}\b[^]]*\] with an empty string:

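    import re
    
    # Remove every [...] group that contains a hashtag followed by exactly two digits.
    # No flags are needed: the pattern uses no ^ or $ anchors.
    print(re.sub(r"\[[^]]*#\d{2}\b[^]]*\]", '',
          "12345 one two [#13 west] words [2025/02/25] #15 [#88]turtles [smth #25 else]."))
    

    Results

    12345 one two  words [2025/02/25] #15 turtles .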
    
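    • In this regex, [^]]* matches any run of characters other than ], so a match can never run past the closing bracket. #\d{2}\b requires a # followed by exactly two digits: \b rejects a third digit (or any other word character) directly after them, so [#123] and [#12abc] are left alone.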
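
To make the pieces of the pattern easier to follow, here is a minimal sketch of the same regex written with re.VERBOSE and wrapped in a helper. The names BRACKET_TAG and strip_tagged_brackets are illustrative only and are not part of the original answer. Note that # has to be escaped in verbose mode, because an unescaped # starts a comment there.

```python
import re

# Same pattern as above, spelled out piece by piece.
# BRACKET_TAG and strip_tagged_brackets are hypothetical names for illustration.
BRACKET_TAG = re.compile(
    r"""
    \[          # opening bracket
    [^\]]*      # optional text before the tag (anything except a closing bracket)
    \#\d{2}\b   # hashtag + two digits, not followed by another word character
    [^\]]*      # optional text after the tag (anything except a closing bracket)
    \]          # closing bracket
    """,
    re.VERBOSE,
)


def strip_tagged_brackets(text: str) -> str:
    """Remove every bracketed group that contains a #NN tag."""
    return BRACKET_TAG.sub("", text)


if __name__ == "__main__":
    sample = "12345 one two [#13 west] words [2025/02/25] #15 [#88]turtles [smth #25 else]."
    print(strip_tagged_brackets(sample))
```

Bracketed groups without a hashtag, such as [25] or [2025/02/25], and a bare #15 outside brackets are left untouched.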
