python excel pandas nlp

I want to transform this code to work with full sentences


I am trying to check whether a keyword occurs in a sentence and, if it does, record that keyword. I managed to write this solution, but it only works if the search term is exactly one word (the keyword itself). How can I improve it so that it works when the keyword occurs inside a sentence? Here is my code:

# keywords is a DataFrame read from an xlsx file with pandas, so turn the column into a list
keyword = keywords['keyword'].tolist()

hit = []
for i in phrase['Search term']:
    if i in keyword:
        hit.append(i)
    else:
        hit.append("blank")

phrase['Keyword'] = hit 

This only works when the search term is a single keyword such as "cat", but it won't work if the word "cat" is part of a sentence. Any pointers to improve it? Thank you all in advance.


Solution

  • I am not sure what you are trying to achieve here. However, I'm going to point out an issue that might help you.

    In your comment you said that keyword is a list of words and phrase['Search term'] is a list of sentences.

    for i in phrase['Search term']:
        if i in keyword:
            hit.append(i)
        ...
    

    In this part of your code you are checking whether an entire sentence i is equal to one of the single words in keyword. That logic is flawed: you need to check whether a word exists in the sentence, not the other way around.
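To see the difference concretely, the short sketch below contrasts the two uses of `in`. The sample keyword list and sentence are made up for illustration and are not taken from the question's data.

```python
# Hypothetical sample data standing in for the question's columns.
keyword = ["cat", "dog"]
sentence = "my cat sleeps all day"

# `in` on a list tests whether some element equals the whole string.
sentence_in_list = sentence in keyword   # False: no element equals the sentence
exact_in_list = "cat" in keyword         # True: "cat" is an element of the list

# `in` on a string tests for a substring.
found = [k for k in keyword if k in sentence]
print(sentence_in_list, exact_in_list, found)
```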

    Something like this:

    for i in phrase['Search term']:
        for j in keyword:
            if j in i:
                hit.append(i)
            ...
    

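    This is only an example that you will need to adjust to your purpose, since it now checks every keyword against every sentence. Note that it appends the sentence i; append j instead if you want to record the keyword.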

    The code above may lead to undesirable behavior, since it checks whether a smaller string (the word) exists inside a larger string (the sentence). It doesn't really check for words. For example, looking for cat in a sentence like:

    this patient is catatonic

    will evaluate your if condition as True, because cat is a substring of catatonic. A way to minimize this is to split the sentence into a list of words and check whether the keyword is in that list. Like this:

    for i in phrase['Search term']:
        for j in keyword:
            if j in i.split(" "):
                hit.append(i)
            ...
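
Putting the pieces together, one possible way to produce exactly one value per row, which the question's phrase['Keyword'] = hit assignment requires, is sketched below. The helper name find_keyword and the sample data are hypothetical. The sketch also assumes that matching should ignore case and punctuation and that the first matching keyword wins, none of which the question specifies.

```python
import re

def find_keyword(sentence, keywords, default="blank"):
    """Return the first keyword that occurs as a whole word in sentence."""
    # Lowercase and extract runs of word characters, so "Cat," matches "cat".
    words = set(re.findall(r"\w+", str(sentence).lower()))
    for kw in keywords:
        if kw.lower() in words:
            return kw
    # No keyword matched: fall back to the placeholder used in the question.
    return default

# Hypothetical sample data standing in for the two spreadsheet columns.
keyword = ["cat", "dog"]
search_terms = ["cat", "I saw a Cat, yesterday", "this patient is catatonic"]

# One entry per search term, so the list lines up with the DataFrame rows.
hit = [find_keyword(s, keyword) for s in search_terms]
print(hit)
```

With the question's DataFrames, the same helper could be applied as phrase['Keyword'] = phrase['Search term'].apply(lambda s: find_keyword(s, keyword)). Keywords made of several words would not match with this word-set approach and would need a different check.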