Search code examples
httpwebproxyload-balancingw3c

How does web application restore original URL after HTTP proxy or load balancer?


You deploy Web application (in my case Java EE + Spring MVC, but I think it doesn't have matter what web-stack is used) and hide it behind HTTP proxy or load balancer.

Proxy/balancer software can fix HTTP headers. This is not question.

But application itself put links into generated HTML:

<a href="http://$HOST:$PORT/$CONTEXT/$PATH">...</a>
<a href="/$CONTEXT/$PATH">...</a>

Proxy/balancer can use different $HOST:$PORT or $CONTEXT part. In case of Java EE with JSP this piece of code fix this issue:

<c:url value="$PATH">
${pageContext.request.contextPath}/$PATH

I want to know how Web framework gets knowledge about user requested $HOST:$PORT/$CONTEXT so it can be rendered in HTML?

Is this info extracted from:

http://en.wikipedia.org/wiki/X-Forwarded-For

non-standard de-facto tag? It look like:

X-Forwarded-For: client, proxy1, proxy2, ..., proxyN

so web framework extract second argument (which is proxy1 in my example, or host IP if N == 0) to provide to you $HOST:$PORT/$CONTEXT?


Solution

  • This is going to be dependent on your particular proxy or load balancer. X-Forwarded-For is very common, but it usually only tells you about the IP address of the original request.

    In AWS you can use three headers to construct more of the original URL:

    • X-Forwarded-For
    • X-Forwarded-Proto
    • X-Forwarded-Port

    Apache uses these and you can configure other custom headers with additional data:

    • X-Forwarded-For - The IP address of the client.
    • X-Forwarded-Host - The original host requested by the client in the Host HTTP request header.
    • X-Forwarded-Server - The hostname of the proxy server.

    In Azure, these headers are available:

    • x-original-host
    • x-original-path

    Bottom line is that there is no standard way of re-constructing the original URL. You will have to use the documentation of whatever proxy you are using. Some data may be missing. In some cases you may be able to configure the proxy to send missing data in custom headers.