python wordnet

Get truth values out of repeating adjectives


I have an array of texts, some of which contain a repeated adjective. From it I want to build a second array of truth values, where 1 means the text contains a repeated adjective and 0 means it does not. This is a sample of my text:

text = ['When someone who is extremely selfish dramatically wonders why people are so selfish !',
        'I asked God to protect me from my enemies .. shortly after I started losing friends']

So far I have tried to get the type of a word with WordNet:

from nltk.corpus import wordnet as wn

my_list = []
for synset in list(wn.all_synsets('a')):
    my_list.append(synset)
my_list

truth_values = []
for sentence in text:
    for word in sentence:
        if word in my_list:
            truth_values.append(1)

This code gives me the following error:

'str' object has no attribute '_name'

For the duplicate condition I thought of a counter like

if counter >=1:
    truth_values.append(1)

Solution

  • Here is a solution. First, let's go through the bugs in your code:

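    Writing list(wn.all_synsets('a')) returns all adjectives as Synset objects, but what you really want is each adjective's name as a string. This is most likely also where the error comes from: testing whether a str is in a list of Synset objects makes Python compare the string with each Synset, and that comparison looks up the _name attribute, which a str does not have. Calling synset.name() returns data in this format: acroscopic.a.01. Since we only want the first part of that (and as a string), we will change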

    for synset in list(wn.all_synsets('a')):
        my_list.append(synset)
    

    to

    for synset in list(wn.all_synsets('a')):
        my_list.append(str(synset.name()).split(".")[0])
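
As a hedged illustration of the name-splitting step, the sketch below extracts the word from synset name strings without needing NLTK installed. The helper base_word is a hypothetical name, not part of NLTK, and the names list is illustrative sample data. It splits from the right, on the assumption that the last two fields are always the part of speech and the sense number, so a word that itself contains a dot would stay intact.

```python
def base_word(synset_name):
    """Return the word part of a WordNet-style name such as 'acroscopic.a.01'."""
    # Split from the right: the last two fields are the part of speech
    # and the sense number, everything before them is the word.
    return synset_name.rsplit(".", 2)[0]


# Illustrative name strings in the format that synset.name() returns.
names = ["acroscopic.a.01", "selfish.a.01", "selfish.a.01"]

# A set drops duplicates and makes later "word in adjectives" checks fast.
adjectives = {base_word(name) for name in names}
print(sorted(adjectives))  # ['acroscopic', 'selfish']
```

Storing the words in a set rather than a list is a design choice: with a vocabulary as large as WordNet's adjectives, membership tests against a list get slow.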
    

    Now my_list holds the adjective names as plain strings. Next, notice that the line

    for word in sentence:
    

    iterates over the individual characters of the sentence rather than its words. What we want is

    for word in sentence.split(" "):
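
To make the difference concrete, here is a small sketch using a fragment of the sample text. It uses split() with no argument instead of split(" "), which is a swapped-in variant: it also copes with line breaks and repeated spaces inside a sentence.

```python
sentence = "why people are so selfish !"

# Iterating over a string yields single characters, not words.
first_items = [item for item in sentence][:3]
print(first_items)  # ['w', 'h', 'y']

# split() with no argument splits on any run of whitespace.
words = sentence.split()
print(words)  # ['why', 'people', 'are', 'so', 'selfish', '!']
```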
    

    All that said, here is how I would solve this problem:

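    truth_values = []
    for sentence in text:
        # Collect every token of this sentence that is a known adjective.
        adjectives = []
        for word in sentence.split(" "):
            if word in my_list:
                adjectives.append(word)
        # 1 if any adjective occurs more than once in the sentence, otherwise 0.
        truth_values.append(1 if any(adjectives.count(adj) > 1 for adj in adjectives) else 0)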
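
For reference, the same idea can be sketched as a self-contained function that runs without WordNet. This is a variant under stated assumptions rather than the exact code from this answer: has_repeated_adjective and SAMPLE_ADJECTIVES are hypothetical names, the tiny adjective set stands in for the real WordNet list, and the lowercasing and punctuation stripping are additions.

```python
from collections import Counter


def has_repeated_adjective(sentence, adjectives):
    """Return 1 if any adjective occurs more than once in the sentence, else 0."""
    # Lowercase and strip surrounding punctuation so "Selfish!" matches "selfish".
    tokens = (tok.strip(".,!?;:\"'()").lower() for tok in sentence.split())
    # Count only the tokens that are known adjectives.
    counts = Counter(tok for tok in tokens if tok in adjectives)
    return 1 if any(n > 1 for n in counts.values()) else 0


# Hypothetical stand-in for the WordNet adjectives. With NLTK installed it
# could be built instead as:
#   {name for s in wn.all_synsets('a') for name in s.lemma_names()}
SAMPLE_ADJECTIVES = {"selfish", "short", "happy"}

text = [
    "When someone who is extremely selfish dramatically wonders why people are so selfish !",
    "I asked God to protect me from my enemies .. shortly after I started losing friends",
]

truth_values = [has_repeated_adjective(s, SAMPLE_ADJECTIVES) for s in text]
print(truth_values)  # [1, 0]
```

Passing the adjective collection in as a parameter keeps the function easy to test, and Counter counts each sentence's adjectives once rather than calling count() for every word.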