python, regex, string-matching

How to match entire words dynamically using Regex in Python


I want to match a whole sequence of words with a regex in Python. I can do it with a hard-coded pattern, but I don't know how to do it dynamically.

Static method

import re
print(re.search(r'\bsmaller than or equal\b', 'When the loan amount is smaller than or equal to 50000'))

I am trying to do the same thing dynamically, by matching each phrase in a list as a whole sequence of words.
Here is the code snippet:

import re
list_less_than_or_equal = ['less than or equal', 'lesser than or equal', 'lower than or equal', 'smaller than or equal','less than or equals', 'lesser than or equals', 'lower than or equals', 'smaller than or equals', 'less than equal', 'lesser than equal', 'higher than equal','less than equals', 'lesser than equals', 'higher than equals']

for word in list_less_than_or_equal:
    print(re.search(r'\b'+word+'\b', 'When the loan amount is smaller than or equal to 50000'))

It prints None for every phrase in the list.

How can I match the entire sequence of words dynamically?


Solution

  • You forgot an r in your second '\b'.

    re.search(r'\b' + re.escape(word) + r'\b', ...)
    #                                   ^
    

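    In a regular (non-raw) string literal, Python itself interprets \b as the backspace character, \x08 (U+0008), before the regex engine ever sees it. The pattern therefore requires a literal backspace after the phrase, which the input does not contain, so the search fails. In a raw string, r'\b' reaches the regex engine as a backslash followed by b, which it reads as a word boundary.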

    I also wrapped the word in re.escape() so that any regex metacharacters in it are matched literally. For example, if a word is "etc. and more", the dot matches only a dot instead of any character.
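
To make both points concrete, here is a minimal sketch of the corrected loop. The shortened phrases list, the variable names, and the 'etcX and more' test string are illustrative assumptions, not part of the question. The last part shows one possible alternative, a single combined pattern, which is an extra suggestion rather than something the question asked for.

```python
import re

sentence = 'When the loan amount is smaller than or equal to 50000'
# Shortened, illustrative subset of the list from the question.
phrases = ['less than or equal', 'smaller than or equal', 'smaller than or equals']

# '\b' in a normal literal is one character (backspace);
# r'\b' is two characters (backslash + 'b').
print('\b' == '\x08', len('\b'), len(r'\b'))  # True 1 2

# Corrected loop: both word-boundary markers are raw strings.
matches = []
for phrase in phrases:
    m = re.search(r'\b' + re.escape(phrase) + r'\b', sentence)
    if m:
        matches.append(m.group())
print(matches)  # ['smaller than or equal']

# Effect of re.escape: without it, the dot matches any character.
unescaped = re.search(r'\b' + 'etc. and more' + r'\b', 'etcX and more')
escaped = re.search(r'\b' + re.escape('etc. and more') + r'\b', 'etcX and more')
print(unescaped is not None, escaped is not None)  # True False

# Alternative: build one pattern that tries every phrase in a single search.
combined = re.compile(r'\b(?:' + '|'.join(map(re.escape, phrases)) + r')\b')
found = combined.search(sentence)
print(found.group() if found else None)  # smaller than or equal
```

With the combined pattern, alternatives are tried left to right, so if one phrase is a prefix of another and both could match, listing the longer phrase first makes it the one that is returned.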