I want to use NLTK and WordNet to understand the semantic relation between two words. For example, if I enter "employee" and "waiter", it should return something showing that employee is more general than waiter. For "employee" and "worker", it should return equal. Does anyone know how to do that?
First, you have to tackle the problem of mapping words to lemmas and then to synsets, i.e. how do you identify a synset from a word?
word => lemma => lemma.pos.sense => synset
Waiters => waiter => 'waiter.n.01' => wn.Synset('waiter.n.01')
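As a rough sketch of that first step, the helper below asks a WordNet reader for every candidate synset of a word so you can pick the sense you mean. The names `candidate_synsets`, `describe` and `demo` are made up for illustration. The `wordnet` argument is assumed to be NLTK 3's `nltk.corpus.wordnet` reader, whose `synsets()` lookup also tries morphological base forms, so an inflected form such as "waiters" should normally resolve to the "waiter" synsets.

```python
def candidate_synsets(word, wordnet, pos="n"):
    """Return every synset the reader lists for `word` with this part of speech.

    `wordnet` is assumed to be NLTK's `nltk.corpus.wordnet` reader, or any
    object exposing a compatible `synsets(word, pos=...)` method.
    """
    # WordNet stores multi-word lemmas with underscores, e.g. "dining_room".
    lemma = word.strip().lower().replace(" ", "_")
    return wordnet.synsets(lemma, pos=pos)


def describe(synsets):
    """Pair each synset name (e.g. 'waiter.n.01') with its gloss."""
    # In NLTK 3, name() and definition() are methods, not attributes.
    return [(s.name(), s.definition()) for s in synsets]


def demo():
    # Hypothetical usage; needs `pip install nltk` and a one-time
    # `nltk.download("wordnet")` before it will run.
    from nltk.corpus import wordnet as wn

    for name, gloss in describe(candidate_synsets("waiters", wn)):
        print(name, "-", gloss)
```

Calling `demo()` lists the candidate senses with their glosses. Choosing the right one among them (word sense disambiguation) is a separate problem that this sketch does not attempt.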
So let's say you have already dealt with the above problem and arrived at the rightmost representation of waiter; then you can continue to compare synsets. Do note that a word can have many synsets.
from nltk.corpus import wordnet as wn

# In NLTK 3, synsets are looked up with wn.synset() and lemma_names() is a method.
waiter = wn.synset('waiter.n.01')
employee = wn.synset('employee.n.01')

# Lemma names of every synset below each word in the hyponym hierarchy
all_hyponyms_of_waiter = list(set([w.replace("_", " ") for s in waiter.closure(lambda s: s.hyponyms()) for w in s.lemma_names()]))
all_hyponyms_of_employee = list(set([w.replace("_", " ") for s in employee.closure(lambda s: s.hyponyms()) for w in s.lemma_names()]))

if 'waiter' in all_hyponyms_of_employee:
    print('employee more general than waiter')
elif 'employee' in all_hyponyms_of_waiter:
    print('waiter more general than employee')
else:
    print("WordNet doesn't list employee or waiter as a hyponym of the other")
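Since a word can have many synsets, and the question also asks about the "equal" case, here is a hedged variant that compares synsets directly instead of lemma strings and tries every pair of senses. It is only a sketch: `all_hypernyms`, `relation`, `word_relation` and the returned labels are names invented for this example, it follows `hypernyms()` links only (not `instance_hypernyms()`), and it treats "equal" as "both words share the same synset".

```python
def all_hypernyms(synset):
    """Collect every ancestor of `synset` by following hypernym links upward."""
    seen = set()
    stack = list(synset.hypernyms())
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(current.hypernyms())
    return seen


def relation(a, b):
    """Describe synset `a` relative to synset `b`."""
    if a == b:
        return "equal"
    if a in all_hypernyms(b):
        return "more general"   # a is an ancestor of b
    if b in all_hypernyms(a):
        return "more specific"  # b is an ancestor of a
    return "neither"


def word_relation(word_a, word_b, wordnet, pos="n"):
    """Try every pair of senses and return the first relation that is found.

    `wordnet` is assumed to be NLTK's `nltk.corpus.wordnet` reader, or any
    object with a compatible `synsets(word, pos=...)` method.
    """
    for a in wordnet.synsets(word_a, pos=pos):
        for b in wordnet.synsets(word_b, pos=pos):
            rel = relation(a, b)
            if rel != "neither":
                return rel, a, b
    return "neither", None, None


def demo():
    # Hypothetical usage; needs nltk installed and the wordnet corpus downloaded.
    from nltk.corpus import wordnet as wn

    print(word_relation("employee", "waiter", wn))
```

Returning the first matching pair of senses is a simplification. If the words are ambiguous, you may prefer to collect all matching pairs and pick among them yourself.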