python string string-matching

Exact, case-insensitive match for a multi-word token in a string (Python)


I have a list that contains both single-word and multi-word tokens.

brand_list = ['ibm','microsoft','abby softwate', 'tata computer services']

I need to check whether any of these tokens appears in a title string. Matching a single-word token works, but my code fails for multi-word tokens. Here is my code. Please help me out.

import string
def check_firm(test_title):
    translator = str.maketrans('', '', string.punctuation)
    title = test_title.translate(translator)
    if any(one_word.lower() in title.lower().split(' ') for one_word in brand_list):
        status_code_value = 0
        print("OEM word found")
    else:
        status_code_value = 1
        print("OEM word not found")

    print("current value of status code ------------>", status_code_value)

Solution

  • Change this:

    if any(one_word.lower() in title.lower().split(' ') for one_word in brand_list):
    

    to this:

    if title.lower() in brand_list:
    

    Hence:

    import string
    brand_list = ['ibm','Microsoft','abby softwate', 'TATA computer services']
    brand_list = [x.lower() for x in brand_list] # ['ibm', 'microsoft', 'abby softwate', 
                                                 #  'tata computer services']
    
    def check_firm(test_title):
        translator = str.maketrans('', '', string.punctuation)
        title = test_title.translate(translator)
    
        if title.lower() in brand_list:
            status_code_value = 0
            print("OEM word found")
        else:
            status_code_value = 1
            print("OEM word not found")
    
        print("current value of status code ------------>", status_code_value)
    
    check_firm('iBM')
    check_firm('Tata Computer SERVICES')
    check_firm('Khan trading Co.')
    

    OUTPUT:

    OEM word found
    current value of status code ------------> 0
    OEM word found
    current value of status code ------------> 0
    OEM word not found
    current value of status code ------------> 1
    

    Note: I converted all the elements in the list to lowercase using:

     brand_list = [x.lower() for x in brand_list]
    

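    This makes the comparison case-insensitive, because the title is lowercased as well before the lookup. Keep in mind that the membership test compares whole strings: it succeeds only when the entire cleaned title equals one of the entries, not when a brand merely appears somewhere inside a longer title.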
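
As a small illustration of why the original split-based test missed multi-word brands while the list lookup handles them, the sketch below compares the two checks on one lowercased title. The sample strings are only assumptions chosen for the demonstration.

```python
title = "tata computer services"
brand_list = ['ibm', 'microsoft', 'abby softwate', 'tata computer services']

# Splitting on spaces yields single words only.
words = title.split(' ')
print(words)                    # ['tata', 'computer', 'services']

# A multi-word brand can never equal one of those single words,
# which is why the original check failed.
print("tata computer services" in words)    # False

# Membership in the brand list compares whole strings, so it succeeds
# when the entire title equals an entry ...
print(title in brand_list)                  # True

# ... and fails as soon as the title contains anything else.
longer_title = "tata computer services made a profit"
print(longer_title in brand_list)           # False
```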

    EDIT:

    OP: But my input is a title string, for example "Tata Computer SERVICES made a profit of x dollars". In that case, how can we find the brand?

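    In that case, I would split the string and pass only the leading words to the function. This assumes the brand sits at the start of the title and that you know how many words it has: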

    inp_st1 = 'iBM'
    inp_st2 = 'Tata Computer SERVICES made a profit of x dollars'
    inp_st3 = 'Khan trading Co.'
    
    check_firm(inp_st1)
    check_firm(" ".join(inp_st2.split()[:3])) # Tata Computer SERVICES
    check_firm(inp_st3)
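
If the position of the brand inside the title is not known in advance, one alternative (not part of the answer above) is to search for each brand as a whole token with a regular expression. The following is a minimal sketch under some assumptions: the helper name find_brand is hypothetical, brand names start and end with a letter or digit so that \b word boundaries apply, and the words of a multi-word brand are separated by single spaces in the title.

```python
import re

brand_list = ['ibm', 'microsoft', 'abby softwate', 'tata computer services']

def find_brand(title, brands):
    """Return the first brand found as a whole token in title, else None.

    Hypothetical helper for illustration only.
    """
    for brand in brands:
        # \b anchors the match at word boundaries, so 'ibm' does not
        # match inside 'IBMers'; re.escape guards against regex
        # metacharacters in a brand name.
        pattern = r'\b' + re.escape(brand) + r'\b'
        if re.search(pattern, title, flags=re.IGNORECASE):
            return brand
    return None

print(find_brand('Tata Computer SERVICES made a profit of x dollars', brand_list))
# tata computer services
print(find_brand('IBMers met in Armonk', brand_list))
# None
```

Because the match is case-insensitive and bounded by \b, punctuation next to a brand (as in 'iBM, Inc.') does not need to be stripped first.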