Python urlparse, correct or incorrect?


Python's urlparse function parses a URL into six components (scheme, netloc, path, params, query and fragment).

Now I've found that parsing "example.com/path/file.ext" returns an empty netloc and a path of "example.com/path/file.ext".
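
A minimal reproduction of that behaviour, assuming Python 3, where the function lives in urllib.parse (in Python 2 it was in the urlparse module). The variable name no_scheme is only for illustration:

```python
from urllib.parse import urlparse

# No "//" anywhere in the input, so nothing is recognised as a network location.
no_scheme = urlparse("example.com/path/file.ext")

print(repr(no_scheme.scheme))  # ''
print(repr(no_scheme.netloc))  # ''
print(repr(no_scheme.path))    # 'example.com/path/file.ext'
```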

Shouldn't it be netloc = "example.com" and path = "/path/file.ext"?

Do we really need a "://" to determine whether or not a netloc exists?

Python's ticket: http://bugs.python.org/issue8284


Solution

  • Without the scheme://, there's no guarantee that example.com is a domain; you could just as well have a directory called example.com. Similarly, given a URL like 'omfgroflmao/path/file.ext', how would you know whether 'omfgroflmao' is a machine on the local network (i.e. a netloc) or a path component?

    I can't see that the Python code is actually wrong, but perhaps the documentation needs to spell out explicitly the behaviour in such ambiguous circumstances (I haven't checked).
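
If you already know that your scheme-less strings start with a host name, you can resolve the ambiguity yourself: a leading "//" is what urlparse treats as the start of a netloc. The sketch below assumes Python 3 (urllib.parse); parse_assuming_host is a hypothetical helper, not part of the standard library, and it only makes sense when that assumption about your input actually holds.

```python
from urllib.parse import urlparse


def parse_assuming_host(url):
    """Hypothetical helper: parse url, treating a scheme-less string as host-first."""
    parts = urlparse(url)
    # Assumption: if neither a scheme nor a netloc was found, the caller means
    # the first path segment to be a host, so re-parse with a leading "//".
    if not parts.scheme and not parts.netloc:
        parts = urlparse("//" + url)
    return parts


guessed = parse_assuming_host("example.com/path/file.ext")
print(guessed.netloc, guessed.path)

explicit = parse_assuming_host("http://example.com/path/file.ext")
print(explicit.scheme, explicit.netloc, explicit.path)
```

Here the first call yields netloc "example.com" and path "/path/file.ext", while a URL that already carries its scheme is parsed unchanged. The same helper would misread a genuinely relative path such as 'omfgroflmao/path/file.ext', which is exactly why urlparse does not make this guess for you.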