The background (though this is not a Django-only question) is that the Django test server does not return a scheme or netloc in its response and request URLs. I get /foo/bar, for example, and I want to end up with http://localhost:8000/foo/bar.
urllib.parse.urlparse (but not so much urllib.parse.urlsplit) makes gathering the relevant bits of information, from the test URL and my known server address, easy. What seems more complicated than necessary is recomposing a new URL with the scheme and netloc added via urllib.parse.urlunparse, which wants a sequence of positional components, but does not document what they are, nor support named arguments. Meanwhile, the parsing functions return immutable tuples...
def urlunparse(components):
    """Put a parsed URL back together again. This may result in a ..."""
I did get it working (see the code below), but it looks really kludgy, particularly the part where I first have to turn the parse tuples into lists and then modify the list at the needed index position.
Is there a more Pythonic way?
from urllib.parse import urlsplit, parse_qs, urlunparse, urlparse, urlencode, ParseResult, SplitResult
server_at_ = "http://localhost:8000"
url_in = "/foo/bar" # this comes from Django test framework I want to change this to "http://localhost:8000/foo/bar"
from_server = urlparse(server_at_)
print(" scheme and netloc from server:",from_server)
print(f"{url_in=}")
from_urlparse = urlparse(url_in)
print(" missing scheme and netloc:",from_urlparse)
#this works
print("I can rebuild it unchanged :",urlunparse(from_urlparse))
#however, using the modern urlsplit doesn't work (I didn't know about urlunsplit when asking)
try:
    print("using urlsplit", urlunparse(urlsplit(url_in)))
#pragma: no cover pylint: disable=unused-variable
except (Exception,) as e:
    print("no luck with urlsplit though:", e)
#let's modify the urlparse results to add the scheme and netloc
try:
    from_urlparse.scheme = from_server.scheme
    from_urlparse.netloc = from_server.netloc
    new_url = urlunparse(from_urlparse)
except (Exception,) as e:
    print("can't modify tuples:", e)
# UGGGH, this works, but is there a better way?
parts = [v for v in from_urlparse]
parts[0] = from_server.scheme
parts[1] = from_server.netloc
print("finally:",urlunparse(parts))
scheme and netloc from server: ParseResult(scheme='http', netloc='localhost:8000', path='', params='', query='', fragment='')
url_in='/foo/bar'
missing scheme and netloc: ParseResult(scheme='', netloc='', path='/foo/bar', params='', query='', fragment='')
I can rebuild it unchanged : /foo/bar
no luck with urlsplit though: not enough values to unpack (expected 7, got 6)
can't modify tuples: can't set attribute
finally: http://localhost:8000/foo/bar
If you need it in Django, I found request.build_absolute_uri() in the Stack Overflow question "How can I get the full/absolute URL (with domain) in Django?"
I didn't test it, but it may solve this problem in Django.
Other modules/frameworks may also have their own functions for this.
As I remember, the scrapy module for scraping HTML has its own function, response.urljoin(), to convert a relative URL into an absolute URL.
As for the functions in the urllib module, you have to use them in matching pairs: urlsplit with urlunsplit (which use fewer values), or urlparse with urlunparse (which use more values).
There is also a "hidden" function, _replace() (inherited from namedtuple), which creates a new ParseResult with the replaced values.
new_urlparse = from_urlparse._replace(scheme=from_server.scheme, netloc=from_server.netloc)
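The _replace() approach can be sketched end to end as follows. This is only a minimal illustration that reuses the values from the question; the variable names are my own, and it shows both matching pairs (urlparse/urlunparse and urlsplit/urlunsplit).

```python
from urllib.parse import urlparse, urlsplit, urlunparse, urlunsplit

server_at = "http://localhost:8000"  # known server address (value from the question)
url_in = "/foo/bar"                  # relative URL, e.g. from the Django test framework

from_server = urlparse(server_at)

# urlparse pairs with urlunparse (6 components).
# _replace() returns a NEW ParseResult; the original tuple is left untouched.
parsed = urlparse(url_in)._replace(
    scheme=from_server.scheme, netloc=from_server.netloc
)
absolute_from_parse = urlunparse(parsed)

# urlsplit pairs with urlunsplit (5 components, no separate "params" field).
split = urlsplit(url_in)._replace(
    scheme=from_server.scheme, netloc=from_server.netloc
)
absolute_from_split = urlunsplit(split)

print(absolute_from_parse)
print(absolute_from_split)
```

Both calls should produce the same absolute URL here, so which pair to use mostly depends on whether you need the rarely used params component.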
Usually I only need urljoin():
from urllib.parse import urljoin
server_at_ = "http://localhost:8000" # base
url_in = "/foo/bar" # relative url
absolute_url = urljoin(server_at_, url_in)
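One thing worth knowing about urljoin() is that it resolves the second argument the way a browser resolves a link, so the result depends on whether the relative URL starts with a slash and on whether the base has a path. The sketch below uses a hypothetical base with a path (/api/v1/ is my own example, not from the question) to show the difference.

```python
from urllib.parse import urljoin

# Hypothetical base that already contains a path (assumption for illustration).
base = "http://localhost:8000/api/v1/"

# Leading slash: the path of the base is replaced entirely.
rooted = urljoin(base, "/foo/bar")

# No leading slash: resolved relative to the base "directory".
relative = urljoin(base, "foo/bar")

# Already absolute: the base is ignored.
absolute = urljoin(base, "http://example.com/x")

# Query strings on the relative URL are kept.
with_query = urljoin("http://localhost:8000", "/foo/bar?x=1")

print(rooted)      # http://localhost:8000/foo/bar
print(relative)    # http://localhost:8000/api/v1/foo/bar
print(absolute)    # http://example.com/x
print(with_query)  # http://localhost:8000/foo/bar?x=1
```

For the case in the question (a bare server address plus a path starting with a slash), this behaves the same as replacing the scheme and netloc by hand.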