
HTML Unescape is not unescaping special characters


My program does not unescape the HTML special characters for quotes and I can't figure out why. It still displays the special characters in the Terminal.

For example: 'In the comic book &quot;Archie&quot;

import requests
import html

API_URL = "https://opentdb.com/api.php"

parameters = {
    "amount": 10,
    "type": "boolean"
}

response = requests.get(API_URL, params=parameters)
data = html.unescape(response.json())
unescaped_data = data["results"]
print(f"UNESCAPED DATA: {unescaped_data}") # THIS IS NOT WORKING

Solution

  • The result isn't unescaped because response.json() returns the parsed JSON (here a dict), not a string. html.unescape() only works on strings; given a dict, it finds no "&" among the keys and returns the dict unchanged, without raising an error. You could unescape the raw response text with html.unescape(response.text), but that leaves you with invalid JSON, e.g.: "question":""Windows NT" is a monolithic kernel.", (note the additional quotes). The escaping is there for a reason, so unescape only the parts you really need, that is, the string values inside the parsed JSON object.
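
One way to unescape only the string components is to walk the parsed JSON and call html.unescape() on each string. The following is a minimal sketch, not a definitive implementation: the helper name unescape_strings and the sample dict are made up for illustration, and the sample only imitates the shape of the API response shown in the question.

```python
import html


def unescape_strings(value):
    """Recursively unescape HTML entities in every string of a parsed JSON value.

    Hypothetical helper for illustration; dict keys are left untouched.
    """
    if isinstance(value, str):
        return html.unescape(value)
    if isinstance(value, list):
        return [unescape_strings(item) for item in value]
    if isinstance(value, dict):
        return {key: unescape_strings(item) for key, item in value.items()}
    # Numbers, booleans and None carry no entities, so return them as they are.
    return value


# Made-up sample standing in for response.json().
sample = {
    "response_code": 0,
    "results": [
        {
            "question": "&quot;Windows NT&quot; is a monolithic kernel.",
            "correct_answer": "False",
        }
    ],
}

unescaped = unescape_strings(sample)
print(unescaped["results"][0]["question"])
```

In the code from the question, this would presumably replace the html.unescape(response.json()) call, e.g. data = unescape_strings(response.json()). Because the JSON is parsed before anything is unescaped, the extra quotes can no longer break it.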