How can I print out the main lemma of a WordNet synset? Python NLTK

I have a large set of WordNet synsets. A small portion of this set is:

syns = {"Synset('brutal.s.04')", "Synset('benignant.s.02')"}

I want to print out the synset term (the main lemma of the synset) for each synset in the set. For example, the output of the above set should be:

brutal, benignant

This is the code I used:

    from nltk.corpus import wordnet as wn
    for s in syns:
        print(wn.s.lemmas[0])

But this does not work, and I get the following error:

AttributeError: 'WordNetCorpusReader' object has no attribute 's'

This is because s is seen as a string, and not as an object. I tried to change s to byte form like so:

    s = bytes(s)

But that does not work. How can I print out only the lemma as mentioned above, in the simplest way?

I checked here, and that is a good way to do it, but my synsets are in string form and are not actual Synset objects.

Thanks in advance..

Solution

TL;DR

>>> from nltk.corpus import wordnet as wn
>>> syns = {"Synset('brutal.s.04')", "Synset('benignant.s.02')"}
>>> [wn.synset(i[8:-2]) for i in syns]
[Synset('benignant.s.02'), Synset('brutal.s.04')]
>>> syns = [wn.synset(i[8:-2]) for i in syns]
>>> syns[0].lemma_names()
[u'benignant', u'gracious']

First, it is unusual to receive Synset objects as their printed string representation. The intuitive first approach is to evaluate the strings back into objects using eval() with the Synset type, https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L305 (ast.literal_eval() will not help here, since it only accepts literals and not calls such as Synset(...)). Before using eval(), see http://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html:

>>> from nltk.corpus.reader.wordnet import Synset
>>> from nltk.corpus import wordnet as wn
>>> syns = {"Synset('brutal.s.04')", "Synset('benignant.s.02')"}
>>> [eval(i) for i in syns]
[Synset('None'), Synset('None')]
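
If the goal is only to recover the name without executing anything, one option is to parse the string with the standard ast module and read the argument out of the call node. This is a minimal sketch, not part of NLTK: the helper name synset_name_via_ast is hypothetical, and it assumes Python 3.8+ and that every string has the exact form Synset('<name>').

```python
import ast

def synset_name_via_ast(text):
    """Return the single string argument of a "Synset('...')" expression.

    Hypothetical helper: the text is parsed into a syntax tree but never executed.
    """
    node = ast.parse(text, mode="eval").body
    if (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "Synset"
            and len(node.args) == 1
            and not node.keywords
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)):
        return node.args[0].value
    raise ValueError("not a Synset repr: %r" % (text,))

print(synset_name_via_ast("Synset('brutal.s.04')"))  # brutal.s.04
```

Anything that is not a plain Synset('...') call raises ValueError instead of being run, which is the main difference from eval().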

The eval() output shows that the Synset class does not work independently of nltk.corpus.wordnet. So we look at the wordnet.synset() function instead (https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L1217). It appears to take only the pre-assigned name of a Synset object, so:

>>> wn.synset('brutal.s.04')
Synset('brutal.s.04')
>>> type(wn.synset('brutal.s.04'))
<class 'nltk.corpus.reader.wordnet.Synset'>

Once each pseudo-Synset string in your input syns has been turned into a real Synset, you can work with it as shown in "How do I print out just the word itself in a WordNet synset using Python NLTK?"

Back to your unusual input syns: slicing off the 8-character prefix Synset(' and the 2-character suffix ') gives the name of the synset:

>>> syns = {"Synset('brutal.s.04')", "Synset('benignant.s.02')"}
>>> list(syns)[0]
"Synset('benignant.s.02')"
>>> list(syns)[0][8:-2]
'benignant.s.02'
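
The [8:-2] slice silently returns garbage if a string has stray whitespace or double quotes. A slightly more defensive sketch uses a regular expression and fails loudly on anything unexpected. The names _SYNSET_REPR and synset_name_from_repr are hypothetical, and the pattern assumes the strings look like Synset('<name>') or Synset("<name>"):

```python
import re

# Matches Synset('<name>') or Synset("<name>"), allowing surrounding whitespace
_SYNSET_REPR = re.compile(r"""^\s*Synset\((['"])(.+)\1\)\s*$""")

def synset_name_from_repr(text):
    """Extract the synset name from its printed form, or raise ValueError."""
    match = _SYNSET_REPR.match(text)
    if match is None:
        raise ValueError("not a Synset repr: %r" % (text,))
    return match.group(2)

syns = {"Synset('brutal.s.04')", "Synset('benignant.s.02')"}
# Sorted only to make the printed order predictable; sets are unordered
names = sorted(synset_name_from_repr(s) for s in syns)
print(names)  # ['benignant.s.02', 'brutal.s.04']
```

Each extracted name can then be passed to wn.synset() in the same way as the sliced string.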

So, to convert each string back into a Synset:

>>> syns = {"Synset('brutal.s.04')", "Synset('benignant.s.02')"}
>>> [wn.synset(i[8:-2]) for i in syns]
[Synset('benignant.s.02'), Synset('brutal.s.04')]
>>> syns = [wn.synset(i[8:-2]) for i in syns]
>>> syns[0].lemma_names()
[u'benignant', u'gracious']

But let's step back: you're getting this odd input syns because someone saved their output by simply calling str() on a Synset object:

>>> syns[0]
Synset('benignant.s.02')
>>> str(syns[0])
"Synset('benignant.s.02')"

The person could have simply done:

>>> syns[0].name()
u'benignant.s.02'

Then your input syns object would look like this:

syns = {u'brutal.s.04', u'benignant.s.02'}

and to read it, you can simply do:

>>> from nltk.corpus import wordnet as wn
>>> syns = {u'brutal.s.04', u'benignant.s.02'}
>>> syns = [wn.synset(i) for i in syns]
>>> syns[0]
Synset('brutal.s.04')
>>> syns[0].lemma_names()
[u'brutal']
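
In the outputs above, the first entry of lemma_names() matches the head of the synset name, so for the comma-separated output asked for in the question, the word can also be read off the name itself without a WordNet lookup. This is only a rough sketch: main_lemma is a hypothetical helper, and it assumes every name follows the lemma.pos.number pattern seen above, with the lemma in whatever form the name stores it.

```python
def main_lemma(synset_name):
    """Return the lemma part of a name such as 'brutal.s.04'."""
    # Split from the right so only the POS tag and sense number are removed,
    # even if the lemma itself contains a dot
    lemma, _pos, _sense = synset_name.rsplit(".", 2)
    return lemma

names = ["brutal.s.04", "benignant.s.02"]
print(", ".join(main_lemma(n) for n in names))  # brutal, benignant
```

With real Synset objects, syns[0].lemma_names()[0] is the more direct route and does not rely on the naming pattern.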