Using [x for x in wn.all_synsets('n')] I am able to get a list allnouns with all nouns from WordNet with help from NLTK. The list allnouns looks like this: Synset('pile.n.01'), Synset('compost_heap.n.01'), Synset('mass.n.03') and so on. Now I am able to get any element by using allnouns[2], and this should be Synset('mass.n.03').
I would like to extract only the word mass, but for some reason I cannot treat the element like a string. Everything I try gives an AttributeError: 'Synset' object has no attribute, or a TypeError: 'Synset' object is not subscriptable, or <bound method Synset.name of Synset('mass.n.03')> if I try to use .name or .pos.
How about trying this solution:
>>> from nltk.corpus import wordnet as wn
>>> wn.synset('mass.n.03').name().split(".")[0]
'mass'
For your case:
>>> allnouns = [x for x in wn.all_synsets('n')]
The item at index 23 is Synset('substance.n.07'). You can extract the word from its name like this:
>>> allnouns[23].name().split(".")[0]
'substance'
If you want only the word part of the name for every noun synset, collected in a list, use:
>>> [x.name().split(".")[0] for x in wn.all_synsets('n')]
This should give exactly the result you need.
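A slightly more defensive variant, as a sketch only: it assumes synset names always have the form lemma.pos.sense-number and splits from the right, so a lemma that itself contains a period would not be cut short. The helper name headword is made up for this example. It works on plain strings, so it runs without NLTK installed.

```python
def headword(synset_name: str) -> str:
    """Return the lemma part of a synset name such as 'mass.n.03'.

    Assumes the name ends in '.<pos>.<sense number>' (an assumption of
    this sketch, based on the names shown in the question).
    """
    # Split at most twice from the right: [lemma, pos, sense number].
    lemma, _pos, _sense = synset_name.rsplit(".", 2)
    return lemma


print(headword("mass.n.03"))  # mass
# Multi-word lemmas use underscores; replace them if you want spaces.
print(headword("compost_heap.n.01").replace("_", " "))  # compost heap
```

With NLTK you would pass in the string returned by name(), for example headword(allnouns[2].name()).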
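Note: In NLTK's WordNet interface (NLTK 3 and later), name is not an attribute but a method, so it has to be called as name(). The same applies to pos. That is why .name without parentheses shows <bound method Synset.name of Synset('mass.n.03')> instead of a string.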
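To see the difference between referring to a method and calling it, here is a small sketch that does not need NLTK. FakeSynset is a hypothetical stand-in written for this example, not NLTK's real Synset class; it only mimics the fact that name is a method.

```python
class FakeSynset:
    """Hypothetical stand-in for a synset object; illustration only."""

    def __init__(self, name: str):
        self._name = name

    def name(self) -> str:
        # Like the real thing, the name is exposed through a method.
        return self._name


s = FakeSynset("mass.n.03")

# Without parentheses you get the method object itself, not a string.
print(callable(s.name))        # True
print(isinstance(s.name, str)) # False

# With parentheses the method runs and returns the string.
print(s.name())                # mass.n.03
```

The string methods such as split only become available on the return value of s.name(), which is why the parentheses matter.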