python nlp nltk textblob

Finding the nouns in a sentence given the context in Python


How can I find the nouns in a sentence, taking the context into account? I am using the nltk library as follows:

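import nltk

# Assumes the NLTK tokenizer and POS-tagger resources are already downloaded
text = 'I bought a vintage car.'
tokens = nltk.word_tokenize(text)
tagged = nltk.pos_tag(tokens)
result = [pair for pair in tagged if pair[1] == 'NN']

# result = [('vintage', 'NN'), ('car', 'NN')]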

The problem with this script is that it tags vintage as a noun. The word can be a noun, but in this sentence it is an adjective modifying car.

How can I get only the words that act as nouns in their context?

Appendix: Using textblob, we get "vintage car" as a noun phrase:

# Notebook cell syntax; from a shell, drop the leading "!"
!python -m textblob.download_corpora
from textblob import TextBlob
txt = "I bought a vintage car."
blob = TextBlob(txt)
print(blob.noun_phrases)  # ['vintage car']

Solution

  • spaCy might solve your task, since its tagger labels each token in the context of the whole sentence. Try this:

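    import spacy

    # Requires the model: python -m spacy download en_core_web_lg
    nlp = spacy.load("en_core_web_lg")

    def analyze(text):
        """Print each token with its coarse-grained part-of-speech tag."""
        doc = nlp(text)
        for token in doc:
            print(token.text, token.pos_)

    analyze("I bought a vintage car.")
    print()
    analyze("This old wine is a vintage.")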

    Output

    I PRON
    bought VERB
    a DET
    vintage ADJ <- correctly identified as adjective
    car NOUN
    . PUNCT
    
    This DET
    old ADJ
    wine NOUN
    is AUX
    a DET
    vintage NOUN  <- correctly identified as noun
    . PUNCT
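
The snippet above only prints the tags, while the question asks for the nouns themselves. A minimal sketch of that last step follows. The helper name `extract_nouns` and its `include_proper` parameter are made up for illustration and are not part of spaCy; the tagged pairs are copied from the output above so the example runs without a model installed.

```python
from typing import Iterable, List, Tuple

# Coarse-grained tags that spaCy exposes via token.pos_ for nouns
NOUN_TAGS = {"NOUN", "PROPN"}


def extract_nouns(tagged: Iterable[Tuple[str, str]],
                  include_proper: bool = True) -> List[str]:
    """Return the words whose tag marks them as nouns (hypothetical helper)."""
    wanted = NOUN_TAGS if include_proper else {"NOUN"}
    return [word for word, pos in tagged if pos in wanted]


# (text, pos) pairs copied from the output shown above
car = [("I", "PRON"), ("bought", "VERB"), ("a", "DET"),
       ("vintage", "ADJ"), ("car", "NOUN"), (".", "PUNCT")]
wine = [("This", "DET"), ("old", "ADJ"), ("wine", "NOUN"), ("is", "AUX"),
        ("a", "DET"), ("vintage", "NOUN"), (".", "PUNCT")]

print(extract_nouns(car))   # ['car']
print(extract_nouns(wine))  # ['wine', 'vintage']
```

With a loaded pipeline, the same helper could be fed directly from a document, for example `extract_nouns((t.text, t.pos_) for t in nlp("I bought a vintage car."))`.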