urlparse.urlparse returning 3 '/' instead of 2 after scheme


I'd like to add the 'http' scheme in front of a given URL string if it's missing, and otherwise leave the URL alone, so I thought urlparse was the right tool for this. But whenever the URL has no scheme and I call geturl(), I get '///' instead of '//' between the scheme and the domain.

>>> t = urlparse.urlparse('www.example.com', 'http')
>>> t.geturl()
'http:///www.example.com' # three ///

How do I convert this url so it actually looks like:

'http://www.example.com' # two //

Solution

  • Short answer (but it's a bit tautological):

    >>> urlparse.urlparse("http://www.example.com").geturl()
    'http://www.example.com'
    

    In your example code, the hostname is parsed as a path, not a network location, because urlparse only recognizes a netloc when it is introduced by '//':

    >>> urlparse.urlparse("www.example.com/go")
    ParseResult(scheme='', netloc='', path='www.example.com/go', params='', \
        query='', fragment='')
    
    >>> urlparse.urlparse("http://www.example.com/go")
    ParseResult(scheme='http', netloc='www.example.com', path='/go', params='', \
        query='', fragment='')
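
For the original goal (prepend a scheme only when one is missing), one option is to check the parsed scheme and build the string by hand, rather than calling geturl() on the scheme-less parse. The following is a minimal sketch, assuming Python 3, where the module is urllib.parse (in Python 2 it is urlparse). The ensure_scheme helper and its default_scheme parameter are hypothetical names for illustration, not part of the library:

```python
from urllib.parse import urlsplit  # Python 2: from urlparse import urlsplit


def ensure_scheme(url, default_scheme="http"):
    """Return url unchanged if it already has a scheme; otherwise prefix one.

    ensure_scheme and default_scheme are illustrative names, not library API.
    """
    if urlsplit(url).scheme:
        # A scheme is already present, so leave the URL alone.
        return url
    # No scheme: strip any leading slashes (e.g. '//www.example.com') so the
    # result has exactly two slashes after the scheme.
    return default_scheme + "://" + url.lstrip("/")


print(ensure_scheme("www.example.com"))
print(ensure_scheme("http://www.example.com/go"))
```

This sidesteps the path-versus-netloc issue because the scheme-less string is never reassembled by geturl(). A bare host:port string such as www.example.com:8080 is ambiguous, since the text before the colon can itself be read as a scheme, so that case would need separate handling.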