How can I find the nouns in a sentence while taking the context into account? I am using the nltk
library as follows:
import nltk
text = 'I bought a vintage car.'
tokens = nltk.word_tokenize(text)
result = nltk.pos_tag(tokens)
result = [i for i in result if i[1] == 'NN']
# result = [('vintage', 'NN'), ('car', 'NN')]
The problem with this script is that it tags vintage
as a noun. That can be correct in other sentences, but in this context it is an adjective.
How can I get only the words that act as nouns in the given sentence?
Appendix: Using textblob, we get "vintage car" as the noun phrase:
!python -m textblob.download_corpora
from textblob import TextBlob
txt = "I bought a vintage car."
blob = TextBlob(txt)
print(blob.noun_phrases) #['vintage car']
Using spacy might solve your task. Try this:
import spacy
nlp = spacy.load("en_core_web_lg")
def analyze(text):
    doc = nlp(text)
    for token in doc:
        print(token.text, token.pos_)
analyze("I bought a vintage car.")
print()
analyze("This old wine is a vintage.")
Output
I PRON
bought VERB
a DET
vintage ADJ <- correctly identified as adjective
car NOUN
. PUNCT
This DET
old ADJ
wine NOUN
is AUX
a DET
vintage NOUN <- correctly identified as noun
. PUNCT
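To get back to the original goal (a list of nouns rather than a printout of every tag), you can filter on the coarse tag. The following is only a minimal sketch: the helper names keep_nouns and nouns_in are made up for illustration, and counting PROPN (proper nouns) as nouns is my assumption, so drop it if you only want common nouns. The sample tags are copied from the output above so the filter can be checked without loading a model.

```python
from typing import Callable, Iterable, List, Tuple

# Coarse tags treated as nouns. Including PROPN is an assumption;
# use {"NOUN"} if proper nouns should be excluded.
NOUN_TAGS = frozenset({"NOUN", "PROPN"})


def keep_nouns(tagged: Iterable[Tuple[str, str]],
               noun_tags: frozenset = NOUN_TAGS) -> List[str]:
    """Return the words whose tag is in noun_tags (hypothetical helper)."""
    return [word for word, tag in tagged if tag in noun_tags]


def nouns_in(text: str, nlp: Callable) -> List[str]:
    """Tag text with a loaded pipeline (e.g. the nlp object above) and keep the nouns."""
    return keep_nouns((token.text, token.pos_) for token in nlp(text))


# (word, tag) pairs copied from the output shown above.
tagged_car = [("I", "PRON"), ("bought", "VERB"), ("a", "DET"),
              ("vintage", "ADJ"), ("car", "NOUN"), (".", "PUNCT")]
tagged_wine = [("This", "DET"), ("old", "ADJ"), ("wine", "NOUN"),
               ("is", "AUX"), ("a", "DET"), ("vintage", "NOUN"), (".", "PUNCT")]

print(keep_nouns(tagged_car))   # ['car']
print(keep_nouns(tagged_wine))  # ['wine', 'vintage']
```

With the pipeline from the answer loaded, nouns_in("I bought a vintage car.", nlp) runs the same filter on live tags. Results depend on the model, so it is worth checking a few sentences of your own.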