python python-2.7 pos-tagger senti-wordnet

Finding whether the opinion of a sentence is positive or negative


I need to find the opinion expressed in reviews posted on websites, and I am using SentiWordNet for this. I first send the file containing all the reviews to a POS tagger:

tokens = nltk.word_tokenize(line)  # tokenize each line of the file
tagged = nltk.pos_tag(tokens)      # POS-tag the tokens
print tagged

Is there a more accurate way of tokenizing that treats "not good" as a single token instead of two separate words?

Next I have to give a positive and a negative score to each tokenized word and then calculate the total score. Is there a function in SentiWordNet for this? Please help.


Solution

  • First extract the adverbs and adjectives from the review, for example:

    import csv
    import nltk
    from nltk.tokenize import word_tokenize

    para = "What can I say about this place. The staff of the restaurant is nice and the eggplant is not bad. Apart from that, very uninspired food, lack of atmosphere and too expensive. I am a staunch vegetarian and was sorely dissapointed with the veggie options on the menu. Will be the last time I visit, I recommend others to avoid"

    # Keep only adjectives (JJ*) and adverbs (RB*) from the POS-tagged tokens.
    tokens = word_tokenize(para)
    word_features = []
    for word, tag in nltk.pos_tag(tokens):
        if tag in ['JJ', 'JJR', 'JJS', 'RB', 'RBR', 'RBS']:
            word_features.append(word)

    # Load the word,label pairs once instead of re-reading the file for every word.
    with open('words.txt', 'rt') as f:
        reader = csv.reader(f, delimiter=',')
        lexicon = {row[0]: row[1] for row in reader if len(row) >= 2}

    rating = 0
    for word in word_features:
        if word in lexicon:
            print word, lexicon[word]
            if lexicon[word] == 'pos':
                rating = rating + 1
            elif lexicon[word] == 'neg':
                rating = rating - 1
    print rating
    

    You also need an external CSV file (words.txt in the script above) that lists positive and negative words, one word,label pair per line,

    like:

    wrinkle,neg
    wrinkled,neg
    wrinkles,neg
    masterfully,pos
    masterpiece,pos
    masterpieces,pos

    The script above works as follows:

    1. Read the sentence.
    2. Extract the adverbs and adjectives.
    3. Compare them against the CSV file of positive and negative words.
    4. Rate the sentence.

    The result of the above script is:

    nice pos  
    bad neg  
    expensive neg  
    sorely neg  
    -2
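
    In this output, "not bad" lowers the rating because "bad" is looked up on its own, which is the "not good" problem raised in the question. One possible workaround, offered only as a rough sketch, is to flip the polarity of a word that directly follows a negation. The function name score_with_negation and the NEGATIONS set are made up for illustration, and the rule deliberately ignores cases such as "not very good":

```python
from __future__ import print_function  # keeps print() working on Python 2.7 too

# Hypothetical list of negation cues; extend it as needed.
NEGATIONS = {"not", "no", "never", "n't"}


def score_with_negation(tokens, lexicon):
    """Sum +1/-1 per lexicon word, flipping the sign right after a negation."""
    rating = 0
    negate = False
    for token in tokens:
        word = token.lower()
        if word in NEGATIONS:
            negate = True
            continue
        polarity = lexicon.get(word)
        if polarity in ("pos", "neg"):
            value = 1 if polarity == "pos" else -1
            rating += -value if negate else value
        negate = False  # assumption: a negation only affects the next token
    return rating


lexicon = {"nice": "pos", "bad": "neg", "good": "pos"}
print(score_with_negation(["the", "eggplant", "is", "not", "bad"], lexicon))  # 1
print(score_with_negation(["the", "food", "is", "not", "good"], lexicon))     # -1
```

    Here lexicon is a plain dict of word to label, like the one built from words.txt. The tokens would be passed in unfiltered (before keeping only adjectives and adverbs) so that the negation word stays next to the word it modifies.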
    

    Adjust the scoring to your needs. Sorry for my English :P
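
    Since the question asks about SentiWordNet specifically: NLTK ships a corpus reader for it, so the hand-made word list could be replaced by a lookup there. The following is a minimal sketch, assuming the sentiwordnet and wordnet corpora have already been fetched with nltk.download. The helper names penn_to_wordnet and sentiwordnet_score are invented for this example, and using only the first sense of a word is a simplifying assumption rather than a recommended method:

```python
from __future__ import print_function  # keeps print() working on Python 2.7 too


def penn_to_wordnet(tag):
    """Map a Penn Treebank tag to a WordNet POS letter, or None if unsupported."""
    if tag.startswith("JJ"):
        return "a"  # adjective
    if tag.startswith("RB"):
        return "r"  # adverb
    if tag.startswith("NN"):
        return "n"  # noun
    if tag.startswith("VB"):
        return "v"  # verb
    return None


def sentiwordnet_score(word, tag):
    """Return (pos_score, neg_score) of the first SentiWordNet sense, or None."""
    # Imported lazily so the tag-mapping helper works without the NLTK corpora.
    from nltk.corpus import sentiwordnet as swn

    wn_pos = penn_to_wordnet(tag)
    if wn_pos is None:
        return None
    synsets = list(swn.senti_synsets(word, wn_pos))
    if not synsets:
        return None
    first = synsets[0]  # assumption: the first listed sense is good enough
    return first.pos_score(), first.neg_score()
```

    A call such as sentiwordnet_score("nice", "JJ") would return a pair of scores between 0 and 1, which can be subtracted and summed over the tagged words in place of the +1/-1 rating used above.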