Tags: python, utf-8, urllib2

Python: open a URL with accents


In Python 2.7, I want to open a URL which contains accents (the link itself, not the page to which it's pointing). If I use the following:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import urllib2


test = "https://www.notifymydevice.com/push?ApiKey=K6HGFJJCCQE04G29OHSRBIXI&PushTitle=Les%20accents%20:%20éèçà&PushText=Messages%20éèçà&"

urllib2.urlopen(test)

My accents are converted to gibberish (Ã, ¨, ©, etc., rather than the éèà I expect).

I searched for this kind of issue and tried urllib2.urlopen(test.encode('utf-8')), but Python throws an error in that case:

  File "test.py", line 10, in <module>
    urllib2.urlopen(test.encode('utf8'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 98: ordinal not in range(128)


Solution

  • Prefix the string literal with a u so that it is a unicode object. I get no errors when trying this in the REPL:

    import urllib
    test = u'https://www.notifymydevice.com/push?ApiKey=K6HGFJJCCQE04G29OHSRBIXI&PushTitle=Les%20accents%20:%20éèçà&PushText=Messages%20éèçà&'
    urllib.urlopen(test.encode("UTF-8"))
    

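    The u prefix marks a unicode string literal. Without it, the literal is a byte string (str) in Python 2, and calling .encode('utf-8') on a byte string makes Python first decode it implicitly with the ascii codec. That implicit decode is what raises the UnicodeDecodeError shown in the question.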
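
If the server still shows garbled characters, it may help to percent-encode the query parameters instead of sending raw non-ASCII bytes in the URL. The sketch below is one possible way to do that, not a confirmed fix for this service: it assumes the endpoint expects UTF-8, and build_push_url and YOUR_API_KEY are hypothetical names used only for illustration. The import fallback lets the same code run on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# Sketch only: build_push_url and YOUR_API_KEY are hypothetical names.
try:
    from urllib import urlencode        # Python 2.7
except ImportError:
    from urllib.parse import urlencode  # Python 3

BASE_URL = "https://www.notifymydevice.com/push"


def build_push_url(api_key, title, text):
    """Return an ASCII-only URL with UTF-8 percent-encoded parameters."""
    params = [
        ("ApiKey", api_key),
        ("PushTitle", title),
        ("PushText", text),
    ]
    # Encode each unicode value to UTF-8 bytes first; urlencode then
    # percent-encodes every non-ASCII byte (for example, é becomes %C3%A9).
    encoded = [(key, value.encode("utf-8")) for key, value in params]
    return BASE_URL + "?" + urlencode(encoded)


url = build_push_url(u"YOUR_API_KEY", u"Les accents : éèçà", u"Messages éèçà")
print(url)
# The resulting string can be passed to urllib2.urlopen(url) as in the question.
```

Note that urlencode writes spaces as + rather than %20. Both forms are commonly accepted in query strings, but whether this particular service treats them the same is an assumption.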