How to extract the sentence containing a particular word, plus its surrounding sentences, with Python?

Is there a way to get the sentences surrounding a selected word? Say the goal is to get the sentence that contains the word "champion" in the example below, as well as the previous and next sentences, regardless of the word's position or tag, or how many times it is repeated.

text = "This is sentence 1. We are the champions. This is sentence 3. This is sentence 4. This is sentence 5. You are champions too."

In the example above, the word "champion" appears in sentences 2 and 6, so we want sentences 1, 2, 3, 5 and 6, and want to exclude sentence 4.

How can we achieve this with spaCy or other tools?

Solution

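The function below returns every sentence that contains the word, together with the sentence before and the sentence after it. The match is on whole tokens, so pass the word in lowercase and in the form it takes in the text (here "champions", not "champion").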

from nltk.tokenize import sent_tokenize, word_tokenize

def surrounding_sentences(text, word):
    # Requires NLTK's Punkt sentence tokenizer data to be downloaded.
    sentences = sent_tokenize(text)
    word = word.lower()

    # Collect sentence indices rather than sentence text, so the original
    # order is kept and overlapping windows are not duplicated.
    keep = set()
    for i, sentence in enumerate(sentences):
        # Exact token match: "champions" matches "champions", not "champion".
        if word in word_tokenize(sentence.lower()):
            if i > 0:  # `i - 1 > 0` would skip the very first sentence
                keep.add(i - 1)
            keep.add(i)
            if i + 1 < len(sentences):
                keep.add(i + 1)
    return [sentences[i] for i in sorted(keep)]
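
If NLTK is not available, the same windowing idea can be sketched with the standard library alone. This is only a rough sketch, not a replacement for a real sentence tokenizer: split_sentences, sentences_around and the window parameter are names made up for this example, and the regex splitter is a naive assumption that will break on abbreviations such as "Dr." or "e.g.". It also matches the word as a prefix, so "champion" finds "champions" (and would find "championship" as well).

```python
import re

def split_sentences(text):
    # Naive splitter (an assumption for this sketch): break on whitespace
    # that follows ".", "!" or "?". Abbreviations will be split wrongly.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def sentences_around(text, word, window=1):
    sentences = split_sentences(text)
    # Case-insensitive prefix match at a word boundary:
    # "champion" matches "champions" but also "championship".
    pattern = re.compile(r"\b" + re.escape(word) + r"\w*", re.IGNORECASE)

    keep = set()  # indices of sentences to return
    for i, sentence in enumerate(sentences):
        if pattern.search(sentence):
            start = max(0, i - window)
            stop = min(len(sentences), i + window + 1)
            keep.update(range(start, stop))

    # Sorting the indices restores document order; the set removes
    # duplicates where two windows overlap.
    return [sentences[i] for i in sorted(keep)]

text = ("This is sentence 1. We are the champions. This is sentence 3. "
        "This is sentence 4. This is sentence 5. You are champions too.")
result = sentences_around(text, "champion")
for sentence in result:
    print(sentence)
```

For the example text this prints sentences 1, 2, 3, 5 and 6 and leaves out sentence 4. Raising window to 2 would also pull in sentence 4, because it then falls inside both windows.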