python urllib2

I cannot get the full URL after the server redirects me when I use urllib2.urlopen(url).geturl().


For example, I only get 'http://www.stackoverflow.com' when the full URL is 'http://www.stackoverflow.com?key=value&key1=value1'.


Solution

  • urllib2 does not strip the query string after a redirect:

    >>> import urllib2
    >>> r = urllib2.urlopen('http://httpbin.org/redirect-to?url=http://example.com/%3Ffoo=bar')
    >>> r.geturl()
    'http://example.com/?foo=bar'
    

    Perhaps the website you are using redirects you again when a request includes a query string?

    You could use the requests library instead; it lets you either disable redirects altogether (allow_redirects=False) or inspect the history of redirects:

    >>> import requests
    >>> r = requests.get('http://httpbin.org/relative-redirect/4')
    >>> r.history
    [<Response [302]>, <Response [302]>, <Response [302]>, <Response [302]>]
    >>> r.history[2].url
    u'http://httpbin.org/relative-redirect/2'
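
If installing requests is not an option, a similar redirect history can probably be collected with the standard library alone. The following is a minimal sketch, not a definitive implementation. It assumes Python 3, where urllib2 became urllib.request (in Python 2 the same hook lives on urllib2.HTTPRedirectHandler). The names RecordingRedirectHandler, DemoHandler, follow and run_demo are made up for illustration, and the local server with its /start and /final paths is a hypothetical stand-in for the real website so that the example runs offline.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Record each (status code, target URL) pair before following it."""

    def __init__(self):
        self.hops = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # newurl is already absolute here and still carries its query string.
        self.hops.append((code, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local server: /start redirects to /final?key=value..."""

    def do_GET(self):
        if self.path == "/start":
            self.send_response(302)
            self.send_header("Location", "/final?key=value&key1=value1")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        # Keep the demo quiet; the default implementation logs to stderr.
        pass


def follow(url):
    """Open url and return (final URL, list of recorded redirect hops)."""
    recorder = RecordingRedirectHandler()
    # ProxyHandler({}) ignores proxy environment variables; this only
    # matters for the local demo and can be dropped for real sites.
    opener = urllib.request.build_opener(recorder, urllib.request.ProxyHandler({}))
    with opener.open(url) as response:
        return response.geturl(), recorder.hops


def run_demo():
    """Start the local server on a free port, follow /start, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        base = "http://127.0.0.1:%d" % server.server_port
        return base, follow(base + "/start")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    base, (final_url, hops) = run_demo()
    print(final_url)
    for code, target in hops:
        print(code, target)
```

Calling follow() on the real URL and printing the hops should show whether a later redirect is the one that drops the query string, which is the situation suspected above.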