python html-escape-characters

How to convert Simplified Chinese characters to html escape characters with Python?


I want to convert Simplified Chinese text, encoded as GB18030, to its escaped form.

I tried Python's html.escape function, but it does not produce the output I need.

For example, 宁波 should become %C4%FE%B2%A8 and 江北 should become %BD%AD%B1%B1.

How can I do this?

Thank you.


Solution

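  • import urllib.parse  # import the submodule explicitly; a bare "import urllib" is not guaranteed to load it

    # The expected output is URL percent-encoding of the GB18030 bytes, not HTML escaping.
    urllib.parse.quote('宁波', encoding='GB18030') == '%C4%FE%B2%A8'  # True
    urllib.parse.quote('江北', encoding='GB18030') == '%BD%AD%B1%B1'  # True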
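
The same call can be wrapped for reuse and paired with urllib.parse.unquote to check the round trip. This is only a minimal sketch: the helper names percent_encode_gb18030 and percent_decode_gb18030 are made up for illustration and do not come from the answer above.

```python
from urllib.parse import quote, unquote


def percent_encode_gb18030(text: str) -> str:
    """Percent-encode text using its GB18030 byte representation."""
    # quote() encodes the string to bytes with the given codec, then
    # writes each byte that is not URL-safe as %XX (uppercase hex).
    return quote(text, encoding="gb18030")


def percent_decode_gb18030(encoded: str) -> str:
    """Reverse percent_encode_gb18030 (hypothetical helper name)."""
    # unquote() turns %XX sequences back into bytes and decodes them
    # with the same codec.
    return unquote(encoded, encoding="gb18030")


encoded = percent_encode_gb18030("宁波")
print(encoded)
print(percent_decode_gb18030(encoded))
```

html.escape does not help here because it only replaces the characters &, <, > and quotes with HTML entities, so Chinese characters pass through unchanged.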