I have one text file with a list of phrases. Below is how the file looks:
Filename: KP.txt
And from the below input (paragraph), I want to extract the next 2 words after the KP.txt
phrase (the phrases could be anything as shown in my above KP.txt
file). All I need is to extract the next 2 words.
Input:
This is Lee. Thanks for contacting me. I wanted to know the exchange policy at Noriaqer hardware services.
In the above example, I found the phrase " I wanted to know"
, matches with the KP.txt
file content. So if I wanted to extract the next 2 words after this, my output will be like "exchange policy"
.
How could I extract this in python?
Assuming you already know how to read the input file into a list, it can be done with some help from regex.
>>> wordlist = ['I would like to understand', 'I wanted to know', 'I wish to know', 'I am interested to know']
>>> input_text = 'This is Lee. Thanks for contacting me. I wanted to know exchange policy at Noriaqer hardware services.'
>>> def word_extraction (input_text, wordlist):
... for word in wordlist:
... if word in input_text:
... output = re.search (r'(?<=%s)(.\w*){2}' % word, input_text)
... print (output.group ().lstrip ())
>>> word_extraction(input_text, wordlist)
exchange policy
>>> input_text = 'This is Lee. Thanks for contacting me. I wish to know where is Noriaqer hardware.'
>>> word_extraction(input_text, wordlist)
where is
>>> input_text = 'This is Lee. Thanks for contacting me. I\'d like to know where is Noriaqer hardware.'
>>> word_extraction(input_text, wordlist)
>>>
Regex Hint: