I just extracted all my Twitter history into a JSON file, and I want to do some data analysis on the tweets with Python. I opened the terminal and entered the following commands to load the JSON in Python:
>>> import json
>>> with open('tweet.js') as json_file:
...     data = json.load(json_file)
...     print(data)
and got this traceback:
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "C:\Users\George\AppData\Local\Programs\Python\Python38-32\lib\json\__init__.py", line 293, in load
    return loads(fp.read(),
  File "C:\Users\George\AppData\Local\Programs\Python\Python38-32\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 4771: character maps to <undefined>
The JSON file is named tweet.js and it follows this form:
{
  "retweeted" : false,
  "source" : "<a href=\"http://twitter.com/download/android\" rel=\"nofollow\">Twitter for Android</a>",
  "entities" : {
    "hashtags" : [ ],
    "symbols" : [ ],
    "user_mentions" : [ {
      "name" : "Florin Pop \uD83D\uDC68\uD83C\uDFFB\uD83D\uDCBB",
      "screen_name" : "florinpop1705",
      "indices" : [ "0", "14" ],
      "id_str" : "861320851",
      "id" : "861320851"
    } ],
    "urls" : [ ]
  },
  "display_text_range" : [ "0", "155" ],
  "favorite_count" : "0",
  "in_reply_to_status_id_str" : "1194246195243302913",
  "id_str" : "1200417547524493312",
  "in_reply_to_user_id" : "861320851",
  "truncated" : false,
  "retweet_count" : "0",
  "id" : "1200417547524493312",
  "in_reply_to_status_id" : "1194246195243302913",
  "created_at" : "Fri Nov 29 14:13:40 +0000 2019",
  "favorited" : false,
  "full_text" : "@florinpop1705 I've heard good things about it, but never tried it.... Using kdenlive is simple yet some things are difficult to implement like text effect",
  "lang" : "en",
  "in_reply_to_screen_name" : "florinpop1705",
  "in_reply_to_user_id_str" : "861320851"
}
As the traceback shows, Python is decoding the file with the Windows default codec (cp1252), which has no character for byte 0x8d. Assuming the file is saved as UTF-8, specify encoding="utf8" when you open it:
import json

# Open the file as UTF-8 instead of relying on the platform default (cp1252 here).
with open("tweet.js", encoding="utf8") as json_file:
    data = json.load(json_file)

print(data)
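To see why the default codec fails, here is a minimal sketch that reproduces the error on a throwaway file. The sample record and the temporary file are made up for illustration; the only thing taken from the question is the cp1252 codec named in the traceback. The letter "č" is stored in UTF-8 as the bytes C4 8D, and 0x8d is one of the bytes cp1252 leaves undefined.

```python
import json
import os
import tempfile

# Hypothetical sample record; "č" is encoded in UTF-8 as the bytes C4 8D.
sample = {"full_text": "kava ili čaj"}

# Write the sample as raw UTF-8 JSON to a temporary file.
fd, path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w", encoding="utf-8") as f:
    json.dump(sample, f, ensure_ascii=False)

# Decoding with cp1252 should fail on the 0x8d byte.
try:
    with open(path, encoding="cp1252") as f:
        json.load(f)
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

# Decoding with UTF-8 reads the same bytes back correctly.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

os.remove(path)
print(cp1252_failed, loaded == sample)
```

Passing the encoding explicitly also makes the script behave the same on machines whose default encoding differs.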