python regex nsregularexpression

Searching for a whole word that contains leading or trailing special characters such as - and = using regex in Python


I am trying to find the position of a string (word) in a sentence, using the function below. It works perfectly for most words, but for the string GLC-SX-MM= in the sentence "I have a lot of GLC-SX-MM= in my inventory list" I cannot get a match. I tried escaping - and =, but that did not work. Any ideas? I cannot split the sentence on spaces because some of my keys are compound words separated by a space.

import re

def get_start_end(self, sentence, key):
    r = re.compile(r'\b(%s)\b' % key, re.I)
    m = r.search(sentence)  # m is None for keys such as GLC-SX-MM=
    start = m.start()
    end = m.end()
    return start, end

Solution

  • You need to escape the key when looking for a literal string, and make sure to use unambiguous (?<!\w) and (?!\w) boundaries:

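    import re

    def get_start_end(self, sentence, key):
        # Escape the key so that -, = and . are treated as literal characters
        r = re.compile(r'(?<!\w){}(?!\w)'.format(re.escape(key)), re.I)
        m = r.search(sentence)
        if m is None:
            return None  # key not found as a whole word
        return m.start(), m.end()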
    

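    \b fails here because it only matches between a word character and a non-word character: in GLC-SX-MM= followed by a space, both = and the space are non-word characters, so the trailing \b never matches. The expression r'(?<!\w){}(?!\w)'.format(re.escape(key)) builds a regex like (?<!\w)abc\.def\=(?!\w) from the keyword abc.def= (on Python 3.7+, re.escape leaves = unescaped, giving the equivalent (?<!\w)abc\.def=(?!\w)). The (?<!\w) lookbehind fails the match if there is a word character immediately to the left of the keyword, and the (?!\w) lookahead fails it if there is a word character immediately to the right.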
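
    As a quick check of the behaviour described above, here is a minimal standalone sketch. It assumes a plain function instead of a method (the self parameter is dropped), returns None when nothing matches, and uses the sentence from the question. The names find_span and find_span_word_boundary are hypothetical and only serve the comparison.

```python
import re

def find_span(sentence, key):
    """Return (start, end) of the first whole-word match of key, or None."""
    # Lookarounds only require that no word character touches the key.
    pattern = re.compile(r'(?<!\w){}(?!\w)'.format(re.escape(key)), re.I)
    m = pattern.search(sentence)
    return (m.start(), m.end()) if m else None

def find_span_word_boundary(sentence, key):
    """Same search, but with \\b boundaries, for comparison."""
    pattern = re.compile(r'\b{}\b'.format(re.escape(key)), re.I)
    m = pattern.search(sentence)
    return (m.start(), m.end()) if m else None

sentence = "I have a lot of GLC-SX-MM= in my inventory list"
key = "GLC-SX-MM="

span = find_span(sentence, key)                       # a (start, end) tuple
boundary_span = find_span_word_boundary(sentence, key)  # None: no \b after "="

# The key is rejected when it is glued to a word character on either side.
glued_left = find_span("part XGLC-SX-MM= here", key)   # None
glued_right = find_span("part GLC-SX-MM=2 here", key)  # None
```

    Slicing the sentence with the returned span, sentence[span[0]:span[1]], gives back the key, while the \b variant finds nothing even with the key escaped. That suggests the boundary, not the escaping, was the original problem.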