python numpy tweepy sentiment-analysis

How to solve 'numpy.float64' object has no attribute 'encode' in Python 3


I am trying to run a sentiment analysis of tweets about different car brands, using Python 3. When I run the code, I get the exception below:

Traceback (most recent call last):
  File "C:\Users\Jeet Chatterjee\NLP\Maruti_Toyota_Marcedes_Brand_analysis.py", line 55, in <module>
    x = str(x.encode('utf-8','ignore'),errors ='ignore')
AttributeError: 'numpy.float64' object has no attribute 'encode'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Jeet Chatterjee\NLP\Maruti_Toyota_Marcedes_Brand_analysis.py", line 62, in <module>
    tweets.set_value(idx,column,'')
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\pandas\core\frame.py", line 1856, in set_value
    engine.set_value(series._values, index, value)
  File "pandas\_libs\index.pyx", line 116, in pandas._libs.index.IndexEngine.set_value (pandas\_libs\index.c:4690)
  File "pandas\_libs\index.pyx", line 130, in pandas._libs.index.IndexEngine.set_value (pandas\_libs\index.c:4578)
  File "pandas\_libs\src\util.pxd", line 101, in util.set_value_at (pandas\_libs\index.c:21043)
  File "pandas\_libs\src\util.pxd", line 93, in util.set_value_at_unsafe (pandas\_libs\index.c:20964)
ValueError: could not convert string to float:

I don't know how to do this encoding in Python 3. Here is my code:

from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
from textblob import TextBlob
import json
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
#regular expression in python
import re

#data corpus
tweets_data_path = 'carData.txt'
tweets_data = []
tweets_file = open(tweets_data_path, "r")
for line in tweets_file:
    try:
        tweet = json.loads(line)
        tweets_data.append(tweet)
    except:
        continue
#creating panda dataset
tweets = pd.DataFrame()
index = 0
for num, line in enumerate(tweets_data):
    try:
        print(num, line['text'])
        tweets.loc[index,'text'] = line['text']
        index = index + 1
    except:
        print(num, "line not parsed")
        continue

def brand_in_tweet(brand, tweet):
    brand = brand.lower()
    tweet = tweet.lower()
    match = re.search(brand, tweet)
    if match:
        print('Match Found')
        return brand
    else:
        print('Match not found')
        return 'none'

for index, row in tweets.iterrows():
    temp = TextBlob(row['text'])
    tweets.loc[index,'sentscore'] = temp.sentiment.polarity

for column in tweets.columns:
    for idx in tweets[column].index:
        x = tweets.get_value(idx,column)
        try:
            x = str(x.encode('utf-8','ignore'),errors ='ignore')
            if type(x) == unicode:
                str(str(x),errors='ignore')
            else:
                df.set_value(idx,column,x)
        except Exception:
            print('encoding error: {0} {1}'.format(idx,column))
            tweets.set_value(idx,column,'')
            continue
tweets.to_csv('tweets_export.csv')

if __name__=='__main__':

  brand_in_tweet()

I have posted the full code. I have no clue what causes this error or how to solve it. Please help, and thanks in advance.


Solution

  • There is a problem in this line:

     x = str(x.encode('utf-8','ignore'),errors ='ignore')  
    

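    Here x is a numpy.float64, not a string: the loop visits every column, including the numeric sentscore column. The code tries to encode x as UTF-8 first and convert it to a string afterwards. That is the wrong way around, because only strings have an encode method. Convert x to a string first, then encode the string (in Python 3 the result is a bytes object):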

     x = str(x).encode('utf-8','ignore')
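
To see the difference outside the tweet script, here is a minimal sketch of the same fix. It uses a plain Python float so it runs without extra packages. numpy.float64 behaves the same way for this purpose, since neither type defines encode. The helper name to_utf8_bytes and the sample value 0.25 are made up for illustration and do not appear in the question's code.

```python
def to_utf8_bytes(value):
    """Return value as UTF-8 bytes, converting non-strings with str() first."""
    # str() accepts floats, ints and strings alike; only the resulting
    # string has an .encode() method.
    return str(value).encode('utf-8', 'ignore')


score = 0.25  # stand-in for a sentiment score stored as a float

# Reproduce the original failure: a float has no .encode() method.
try:
    score.encode('utf-8', 'ignore')
    failed = False
except AttributeError:
    failed = True

encoded = to_utf8_bytes(score)      # a bytes object in Python 3
decoded = encoded.decode('utf-8')   # back to an ordinary str

print(failed, encoded, decoded)
```

Strings in Python 3 are already Unicode, so if the goal is only to write the text to a CSV file, keeping the values as str and skipping the encode step may be enough.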