Search code examples
pythonencodingurllib

Python encoding characters with urllib.quote


I'm trying to encode non-ASCII characters so I can put them inside an url and use them in urlopen. The problem is that I want an encoding like JavaScript (that for example encodes ó as %C3%B3):

encodeURIComponent(ó)
'%C3%B3'

But urllib.quote in python returns ó as %F3:

urllib.quote(ó)
'%F3'

I want to know how to achieve an encoding like javascript's encodeURIComponent in Python, and also if I can encode non ISO 8859-1 characters like Chinese. Thanks!


Solution

  • You want to make sure you're using unicode.

    Example:

    import urllib
    
    s = u"ó"
    print urllib.quote(s.encode("utf-8"))
    

    Outputs:

    %C3%B3