In Python 2.7, I want to open a URL which contains accents (the link itself, not the page to which it's pointing). If I use the following:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import urllib2
test = "https://www.notifymydevice.com/push?ApiKey=K6HGFJJCCQE04G29OHSRBIXI&PushTitle=Les%20accents%20:%20éèçà&PushText=Messages%20éèçà&"
urllib2.urlopen(test)
My accents are converted to gibberish (Ã, ¨, ©, etc., rather than the éèà I expect).
I searched for this kind of issue and tried urllib2.urlopen(test.encode('utf-8')), but Python throws an error in that case:
  File "test.py", line 10, in <module>
    urllib2.urlopen(test.encode('utf8'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 98: ordinal not in range(128)
Prefix the string with a u. I get no errors trying it out in the REPL with this:
import urllib
test = u'https://www.notifymydevice.com/push?ApiKey=K6HGFJJCCQE04G29OHSRBIXI&PushTitle=Les%20accents%20:%20éèçà&PushText=Messages%20éèçà&'
urllib.urlopen(test.encode("UTF-8"))
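The u prefix makes the literal a unicode string. Without it, test is a byte string, and in Python 2 calling .encode('utf-8') on a byte string first decodes it implicitly with the ASCII codec, which is what raises the UnicodeDecodeError on the accented bytes.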
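If the accents still arrive garbled on the receiving end, it may help to percent-encode the query string instead of sending raw non-ASCII bytes in the URL. Below is a minimal sketch of that idea, assuming the server decodes query parameters as UTF-8 and accepts + for spaces (I have not verified either for this service). build_push_url and the "YOUR_API_KEY" placeholder are made up for illustration; the import fallback is only there so the snippet also runs on Python 3.

```python
# -*- coding: utf-8 -*-
try:
    from urllib import urlencode        # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3


def build_push_url(api_key, title, text):
    """Hypothetical helper: build the push URL with a percent-encoded query."""
    params = [
        ("ApiKey", api_key),
        # Encode the unicode text to UTF-8 bytes first; urlencode then
        # percent-encodes every byte that is not URL-safe.
        ("PushTitle", title.encode("utf-8")),
        ("PushText", text.encode("utf-8")),
    ]
    return "https://www.notifymydevice.com/push?" + urlencode(params)


# "YOUR_API_KEY" is a placeholder, not a real key.
url = build_push_url("YOUR_API_KEY", u"Les accents : éèçà", u"Messages éèçà")
```

The resulting url is plain ASCII (é becomes %C3%A9, and so on), so it can be passed to urllib2.urlopen(url) without any further encode call.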