I'm loading a JSON file made up of Yelp restaurant reviews and decoding it so that the Unicode escape sequences are handled, like this:
import json
def parse_yelp_restaurant_api(self, response):
    # strict=False allows control characters inside JSON strings
    jsonresponse = json.loads(response.text, strict=False)
Now I would also like to decode the HTML character entities. My JSON file is full of '&#39;', '&quot;', etc.
I solved the problem by calling html.unescape on the retrieved fields, as suggested by Panagiotis Kanavos. Using response.json() (as suggested by puchal) also made the Unicode decoding easier.
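For anyone landing here later, this is roughly what that looks like. It is only a minimal sketch: the sample payload, its field names ("name", "text") and the helper unescape_fields are made up for illustration and are not Yelp's actual schema or part of any library. The only library calls used are json.loads and html.unescape from the standard library.

```python
import html
import json

# Hypothetical payload: the field names are assumptions, not Yelp's real schema.
raw = '{"name": "Joe&#39;s Diner", "text": "They call it &quot;the best&quot; \\u00e9clair"}'


def unescape_fields(value):
    """Recursively apply html.unescape to every string in a decoded JSON value."""
    if isinstance(value, str):
        return html.unescape(value)
    if isinstance(value, list):
        return [unescape_fields(item) for item in value]
    if isinstance(value, dict):
        return {key: unescape_fields(item) for key, item in value.items()}
    # Numbers, booleans and None are returned unchanged.
    return value


# json.loads decodes the \uXXXX escapes; html.unescape then decodes the entities.
review = unescape_fields(json.loads(raw, strict=False))
print(review["name"])
print(review["text"])
```

Unescaping after the JSON is parsed, rather than on the raw text, seems the safer order: decoding &quot; before parsing would put bare double quotes inside the JSON strings and break the document.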