python nlp nltk semantics wordnet

How do I extract the offset of a WordNet synset, given a synset, in Python NLTK?


A sense offset in WordNet is an 8-digit number followed by a POS tag. For example, the offset for the synset 'dog.n.01' is '02084071-n'. I have tried the following code:

    from nltk.corpus import wordnet as wn

    ss = wn.synset('dog.n.01')
    offset = str(ss.offset)
    print (offset)

However, I get this output:

    <bound method Synset.offset of Synset('dog.n.01')>

How do I get the actual offset in this format: '02084071-n'?


Solution

    >>> from nltk.corpus import wordnet as wn
    >>> ss = wn.synset('dog.n.01')
    >>> offset = str(ss.offset()).zfill(8) + '-' + ss.pos()
    >>> offset
    u'02084071-n'
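
The bound-method output in the question appears because, in NLTK 3, `offset` and `pos` are methods on a synset rather than plain attributes, so they have to be called. If the ID is needed in more than one place, the formatting can be wrapped in a small helper. The following is only a sketch: the names `format_offset` and `synset_id` are made up for illustration and are not part of NLTK, and it assumes a synset object that exposes `offset()` and `pos()` as in the session above.

```python
def format_offset(offset, pos):
    """Build a WordNet-style ID such as '02084071-n'.

    offset: the integer offset of the synset
    pos:    the one-letter part-of-speech tag, e.g. 'n' or 'v'
    """
    # Zero-pad the integer to 8 digits, then append '-' and the POS tag.
    return "{:08d}-{}".format(offset, pos)


def synset_id(synset):
    """Return the offset ID for a synset object.

    Assumption: the object has offset() and pos() methods, as NLTK 3
    synsets do. Older NLTK releases exposed these as attributes instead.
    """
    return format_offset(synset.offset(), synset.pos())
```

With NLTK and the WordNet corpus installed, `synset_id(wn.synset('dog.n.01'))` should produce the same string as the session above, provided the installed WordNet version matches the one the question refers to.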