How to find the abstractness of a word using hyper-/hyponyms in WordNet?

I have two words, say computer and tool. Computer is a concrete noun, whereas tool is relatively abstract. I want a measure of each word's level of abstractness that reflects this. I thought the best way to do it would be to count the number of hypernyms/hyponyms for each word.

Is it possible?
Is there a better way to do it?

Solution

The first problem is: which meaning of `computer` are you referring to?

In WordNet, a word can map to several "concepts", a.k.a. synsets:

>>> from nltk.corpus import wordnet as wn

>>> wn.synsets('computer')
[Synset('computer.n.01'), Synset('calculator.n.01')]

>>> wn.synsets('computer')[0].definition()
'a machine for performing calculations automatically'
>>> wn.synsets('computer')[1].definition()
'an expert at calculation (or at operating calculating machines)'

And hyper-/hyponyms are not connected to the word `computer` itself.

Hyper-/hyponyms are concepts, i.e. synsets, too. So they are attached not to the word form but to the synsets that the word `computer` can represent, e.g.:

>>> type(wn.synsets('computer')[0])
<class 'nltk.corpus.reader.wordnet.Synset'>

>>> wn.synsets('computer')[0].hypernyms()
[Synset('machine.n.01')]

>>> wn.synsets('computer')[0].hyponyms()
[Synset('analog_computer.n.01'), Synset('digital_computer.n.01'), Synset('home_computer.n.01'), Synset('node.n.08'), Synset('number_cruncher.n.02'), Synset('pari-mutuel_machine.n.01'), Synset('predictor.n.03'), Synset('server.n.03'), Synset('turing_machine.n.01'), Synset('web_site.n.01')]

Yes, that's a lot of information, but how do I get hyper-/hyponyms for words?

According to the definition, should words have hyper-/hyponyms? Or should concepts have hyper-/hyponyms?

Fine, you're bringing me in circles... Just tell me how to use hyper-/hyponyms to see if a word is more abstract than another word!!!

Okay, then we have to make some assumptions.

1. Let's consider all synsets of a word accessible through WordNet as one "holistic" concept of that word form.
2. We take the sum of all DIRECT hyper-/hyponyms over all synsets of a given word.
3. Based on the number of hyper-/hyponyms of all synsets that a word form can represent, we deduce that word X is more/less abstract than word Y.

But how do I do (1), (2) and (3) in code?

>>> hypernym_count = lambda word: sum(len(ss.hypernyms()) for ss in wn.synsets(word)) 
>>> hyponym_count = lambda word: sum(len(ss.hyponyms()) for ss in wn.synsets(word)) 

>>> hyponym_count('computer')
14
>>> hypernym_count('computer')
2


>>> hypernym_count('tool')
8
>>> hyponym_count('tool')
32

Since (3) is the hypothesis you want to test, you should be the one to decide which heuristic to use for deducing whether a word is more or less abstract from the `hyponym_count` and `hypernym_count` results.
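As one illustration of what such a heuristic could look like, here is a minimal sketch that simply ranks words by their direct hyponym count, on the assumption that a word with more hyponyms is more general. The function name `rank_by_hyponyms` and the choice of ignoring hypernyms are hypothetical, not something WordNet or NLTK prescribes; the counts are the ones copied from the session above.

```python
# Counts copied from the REPL session above: (hypernym_count, hyponym_count).
counts = {"computer": (2, 14), "tool": (8, 32)}


def rank_by_hyponyms(counts):
    """Order words from most to fewest direct hyponyms.

    Assumption (hypothetical heuristic): more direct hyponyms means the
    word is more general, hence "more abstract".
    """
    return sorted(counts, key=lambda word: counts[word][1], reverse=True)


ranking = rank_by_hyponyms(counts)
print(ranking)
```

Under this particular assumption `tool` ranks above `computer`. A different heuristic, for example one that also weighs the hypernym counts, may well order the words differently, which is exactly why the choice needs to be tested.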

Wait a minute, what are `DIRECT` hyper-/hyponyms?

We're only accessing the hyper-/hyponyms one level above/below the synset. That's what "direct" means here.

To get all the hyponyms below a synset rather than just the direct ones, see https://stackoverflow.com/a/42012001/610569
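For comparison with the direct counts, collecting everything below a synset could be sketched roughly as follows. This is not necessarily the approach in the linked answer; `all_hyponyms` and `total_hyponym_count` are hypothetical helper names, and the code only assumes that each synset object exposes a `.hyponyms()` method returning a list, as NLTK's `Synset` does.

```python
def all_hyponyms(synset):
    """Collect every synset reachable below `synset` via hyponym links."""
    seen = set()
    stack = list(synset.hyponyms())
    while stack:
        node = stack.pop()
        if node not in seen:  # a synset can be reachable via several paths
            seen.add(node)
            stack.extend(node.hyponyms())
    return seen


def total_hyponym_count(word, lookup):
    """Sum the transitive hyponym counts over all synsets of `word`.

    `lookup` is any callable mapping a word to its synsets,
    e.g. `wn.synsets` from nltk.corpus.wordnet.
    """
    return sum(len(all_hyponyms(ss)) for ss in lookup(word))
```

With WordNet loaded as `wn`, as in the session above, `total_hyponym_count('tool', wn.synsets)` would be the transitive counterpart of `hyponym_count('tool')`. NLTK's `Synset.closure` method can also walk a relation such as `lambda s: s.hyponyms()` without a hand-written loop.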

So should I use the direct hyponyms, all hyponyms below, or all hypernyms above?

That's for you to find out and tell us =) Have fun!