python nlp nltk stanford-nlp wordnet

Finding the best preposition for a verb


I have a sentence-completion task: I have the subject, verb, and adverb or subject, and all I need is the appropriate preposition in between. Is there an NLP tool that can give a distribution over the prepositions that can go with a given verb?

Best


Solution

  • Here's how to get frequency counts for all verb-preposition pairs in the Brown corpus, and then look up the ones for the verb "go". First the counts:

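    import nltk
    from nltk.corpus import brown

    # One-time setup, if the data is not installed yet:
    # nltk.download("brown"); nltk.download("universal_tagset")

    # Count (verb, next word) pairs where the next word is tagged as an adposition.
    prepchoices = nltk.ConditionalFreqDist(
        (v[0], p[0])
        for (v, p) in nltk.bigrams(brown.tagged_words(tagset="universal"))
        if v[1] == "VERB" and p[1] == "ADP")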

    "ADP" stands for "adposition", i.e. a preposition or postposition. Now let's look at what we've got:

    >>> prepchoices["go"]
    FreqDist({'to': 96, 'with': 20, 'into': 18, 'through': 8, 'on': 8, 'for': 7, 
    'in': 5, 'out': 4, 'around': 4, 'from': 4, ...})
    

    You can get the top choices, in descending order of frequency, with most_common():

    >>> print(prepchoices["go"].most_common(5))
    [('to', 96), ('with', 20), ('into', 18), ('through', 8), ('on', 8)]
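
    Since the question asks for a distribution rather than raw counts, here is a minimal sketch of normalizing the counts into relative frequencies. The function name preposition_distribution and the toy counts are made up for illustration; they are not Brown corpus figures. It only assumes a Counter-like object, which an NLTK FreqDist is, so prepchoices["go"] could be passed in directly:

```python
from collections import Counter


def preposition_distribution(counts):
    """Turn raw preposition counts into relative frequencies that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() keeps the result ordered from most to least frequent.
    return {prep: n / total for prep, n in counts.most_common()}


# Toy counts for illustration only (NOT taken from the Brown corpus).
toy_counts = Counter({"to": 6, "with": 3, "into": 1})
dist = preposition_distribution(toy_counts)
print(dist)
```

    For a single preposition, FreqDist also has a freq() method, so prepchoices["go"].freq("to") should give the same relative frequency without any extra code.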
    

    I didn't do any stemming of the verbs ("goes" and "went" were counted as separate words), or even case-folding. You could add them, but the above should already give you a decent picture of the distribution.
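
    If you do want the case-folding, one possible way to add it is sketched below. This version swaps ConditionalFreqDist for a plain dictionary of Counters so that it runs without any corpus data; count_verb_prepositions and its normalize parameter are hypothetical names, and the toy sentence stands in for brown.tagged_words(tagset="universal"):

```python
from collections import Counter, defaultdict


def count_verb_prepositions(tagged_words, normalize=str.lower):
    """Count prepositions that directly follow each (normalized) verb.

    tagged_words: iterable of (word, tag) pairs using universal tags.
    normalize:    function applied to the verb before counting
                  (lowercasing by default).
    """
    counts = defaultdict(Counter)
    previous = None
    for current in tagged_words:
        if previous is not None:
            (verb, verb_tag), (prep, prep_tag) = previous, current
            if verb_tag == "VERB" and prep_tag == "ADP":
                # Prepositions are lowercased too, so "To" and "to" merge.
                counts[normalize(verb)][prep.lower()] += 1
        previous = current
    return counts


# Toy input for illustration only; in practice this would be
# brown.tagged_words(tagset="universal").
toy_tagged = [
    ("They", "PRON"), ("Go", "VERB"), ("to", "ADP"), ("school", "NOUN"),
    (".", "."),
    ("We", "PRON"), ("go", "VERB"), ("To", "ADP"), ("work", "NOUN"),
]
counts = count_verb_prepositions(toy_tagged)
print(counts["go"].most_common())
```

    To merge inflected forms such as "goes" and "went" as well, you could pass a lemmatizing function as normalize, for example one built on nltk.stem.WordNetLemmatizer with lemmatize(word.lower(), pos="v"). That requires the WordNet data to be downloaded, and I haven't checked how much it changes the counts.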