I have this string, which I get from web scraping. I want to convert the hex codes in it to normal text. I used encode("utf-8"), but it is still not working.
text = 'Hospital Nossa Senhora da Conceição, Porto Alegre, Brazil,Hospital de Base São José do Rio Preto, São José Do Rio Preto, Brazil'
text = text.encode("utf-8")
The expected output must be Hospital Nossa Senhora da Conceição, Porto Alegre, Brazil, Hospital de Base São José do Rio Preto, São José Do Rio Preto
I also tried
text.encode('utf-8').decode('unicode-escape')
but it is still not working. Could anyone help with this?
Apply the html module (HyperText Markup Language support):
This module defines utilities to manipulate HTML.
…
html.unescape(s)
Convert all named and numeric character references (e.g. &gt;, &#62;, &#x3e;) in the string s to the corresponding Unicode characters. This function uses the rules defined by the HTML 5 standard for both valid and invalid character references, and the list of HTML 5 named character references.
New in version 3.4.
import html
text = 'Hospital Nossa Senhora da Conceição, Porto Alegre, Brazil,Hospital de Base São José do Rio Preto, São José Do Rio Preto, Brazil'
unescaped_text = html.unescape(text)
print(unescaped_text)
Output: .\SO\72657237.py
Hospital Nossa Senhora da Conceição, Porto Alegre, Brazil,Hospital de Base São José do Rio Preto, São José Do Rio Preto, Brazil
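As a minimal sketch of why encode() alone does not help, assume the scraped string still contains hexadecimal character references such as &#xE7; (the sample value of raw below is a shortened, hypothetical stand-in for the scraped text, not the exact string from the question). encode() only converts the str to bytes and leaves the references untouched, while html.unescape() resolves them:

```python
import html

# Hypothetical, shortened stand-in for the scraped text, assuming it
# contains hexadecimal character references.
raw = "Concei&#xE7;&#xE3;o, S&#xE3;o Jos&#xE9;"

# encode() only turns the str into bytes; the references are left as-is.
encoded = raw.encode("utf-8")

# html.unescape() replaces each reference with its Unicode character.
clean = html.unescape(raw)

# Named, decimal, and hexadecimal references are all handled the same way.
same_char = html.unescape("&gt; &#62; &#x3e;")

print(encoded)
print(clean)
print(same_char)
```

If the scraped text contains no character references, html.unescape() returns it unchanged, so calling it on already clean text is harmless.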