How to parse a UTF-8 encoded query parameter with Python 2.6

A lovely (Scandinavian?) user of my website is complaining that I cannot parse his username in URLs, so his page on the site shows him no results.

I am pretty sure that the browser encodes the requests as http://councilroom.com/player?player=G%C3%B6rling

I'd like the player string to become Görling rather than the GÃ¶rling it is currently being converted to.

I am using web.py with Python 2.6 and attempting to parse the URL as follows:

parsed_url = urlparse.urlparse(web.ctx.fullpath)
query_dict = dict(urlparse.parse_qsl(parsed_url.query))
target_player = query_dict['player']

Edit: With the help of unutbu, I fixed this by changing it to

query_dict = dict(urlparse.parse_qsl(web.ctx.env['QUERY_STRING']))
target_player = query_dict['player'].decode('utf-8')

I think web.py was mis-parsing the fullpath in web.ctx somehow, but the QUERY_STRING variable is left untouched.
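For reuse, the fix above could be wrapped in a small helper. This is only a sketch: the name get_query_param and its encoding argument are hypothetical, not part of web.py or urlparse. It assumes the raw query string (for example web.ctx.env['QUERY_STRING']) is passed in, and it falls back to urllib.parse so the same code also runs on Python 3, where parse_qsl already returns decoded text.

```python
try:
    from urlparse import parse_qsl  # Python 2.6+
except ImportError:
    from urllib.parse import parse_qsl  # Python 3 fallback


def get_query_param(query_string, name, encoding='utf-8'):
    """Return one query parameter as unicode text.

    Hypothetical helper: query_string is the raw, still percent-encoded
    query string, e.g. 'player=G%C3%B6rling'.
    """
    params = dict(parse_qsl(query_string))
    value = params[name]
    # On Python 2, parse_qsl returns byte strings (bytes is an alias of
    # str there), so they still need decoding. On Python 3 the value is
    # already text and this branch is skipped.
    if isinstance(value, bytes):
        value = value.decode(encoding)
    return value


target_player = get_query_param('player=G%C3%B6rling', 'player')
```

A missing parameter raises KeyError here, just as query_dict['player'] does in the original code.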

Solution

In [4]: import urlparse

In [6]: parsed_url = urlparse.urlparse('http://councilroom.com/player?player=G%C3%B6rling')

In [7]: parsed_url
Out[7]: ParseResult(scheme='http', netloc='councilroom.com', path='/player', params='', query='player=G%C3%B6rling', fragment='')

In [8]: query_dict = dict(urlparse.parse_qsl(parsed_url.query))

In [9]: query_dict
Out[9]: {'player': 'G\xc3\xb6rling'}

Note the .decode('utf-8'):

In [10]: target_player = query_dict['player'].decode('utf-8')

In [11]: target_player
Out[11]: u'G\xf6rling'

In [12]: print(target_player)
Görling

PS. Somehow, each byte in the str object 'G\xc3\xb6rling' was being interpreted as a separate unicode code point (which is what decoding the bytes as Latin-1 would do), with the effect of turning Görling into GÃ¶rling:

In [3]: print(u'G\xc3\xb6rling')
GÃ¶rling
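The difference between the two interpretations can be sketched with explicit decodes. This is an illustration rather than a description of what web.py does internally: it assumes the garbling is equivalent to a Latin-1 decode, and the variable names are made up for the example. It uses only escape sequences, so it runs the same on Python 2.6 and Python 3.

```python
# The raw bytes produced by percent-decoding 'G%C3%B6rling'.
raw = b'G\xc3\xb6rling'

# Correct: the two bytes C3 B6 form one UTF-8 sequence for U+00F6.
correct = raw.decode('utf-8')

# Garbled: Latin-1 maps every byte to the code point with the same
# number, so C3 and B6 become two separate characters.
garbled = raw.decode('latin-1')

# If only the garbled text is available, the mapping can be reversed
# (assuming the garbling really was a Latin-1 style misinterpretation).
repaired = garbled.encode('latin-1').decode('utf-8')
```

Here correct is u'G\xf6rling' (seven characters), while garbled is u'G\xc3\xb6rling' (eight characters), the same value as the unicode literal printed above.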