Tags: apache, next.js, load-balancing, reverse-proxy

Apache 2.4 redirection from load-balanced reverse-proxy not working but working in non-load-balanced reverse-proxy


I have 2 internal Next.js servers/applications that are reverse-proxied by Apache 2.4. Let's call the 2 internal servers http://internal:3000/foo and http://internal:3001/foo, and the external URL http://external/foo.

When the Next.js application is accessed on its base path (i.e. http://internal:3000/foo), it redirects to http://internal:3000/foo/bar/baz with HTTP status code 308. Over the reverse proxy I expected the same, i.e. a redirect from http://external/foo to http://external/foo/bar/baz.
The config in next.config.js is as follows:

...
module.exports = {
  ...
  async redirects() {
    return [
      {
        source: '/',
        destination: '/bar/baz',
        permanent: true
      }
    ]
  },
  // Next.js requires basePath to start with a slash
  basePath: '/foo'
}

This redirection works perfectly when I reverse-proxy to only 1 Next.js application without load balancing, e.g. only to http://internal:3000/foo.
The config that I used is as follows:

<Location "/foo">
    ProxyPass "http://localhost:3000/foo"
    ProxyPassReverse "http://localhost:3000/foo"
</Location>

But the redirection does not work when I reverse-proxy to the 2 Next.js applications with load balancing.
The config that I used is as follows:

<Proxy "balancer://example">
    BalancerMember "http://localhost:3000/foo"
    BalancerMember "http://localhost:3001/foo"
</Proxy>

<Location "/foo">
    ProxyPass "balancer://example"
    ProxyPassReverse "balancer://example"
</Location>

What happens instead is that it keeps redirecting from http://external/foo to http://external/foo, i.e. an infinite redirect that results in ERR_TOO_MANY_REDIRECTS.

It baffles me that the redirection works in the non-load-balanced scenario but fails with load balancing. Any idea what is actually happening? Is there a response header being written that I am unaware of when using proxy load balancing? Thanks!

Update/progress (1):

I suspect there is rewriting happening in mod_proxy_balancer.c, in the following section:

    access_status = rewrite_url(r, *worker, url);
    /* Add the session route to request notes if present */
    if (route) {
        apr_table_setn(r->notes, "session-sticky", sticky);
        apr_table_setn(r->notes, "session-route", route);


        /* Add session info to env. */
        apr_table_setn(r->subprocess_env,
                       "BALANCER_SESSION_STICKY", sticky);
        apr_table_setn(r->subprocess_env,
                       "BALANCER_SESSION_ROUTE", route);
    }
    ap_log_rerror(APLOG_MARK, APLOG_DEBUG, 0, r, APLOGNO(01172)
                  "%s: worker (%s) rewritten to %s",
                  (*balancer)->s->name, (*worker)->s->name, *url);


    return access_status;

which can be found in https://github.com/apache/httpd/blob/317108ee6e84ae47bd0f6121e3a64074c5d68c7b/modules/proxy/mod_proxy_balancer.c#L631-L647

Update/progress (2):

I turned on mod_dumpio to log all traffic into and out of Apache, and confirmed that there is indeed rewriting happening.

The rewriting happens as follows:

  1. GET /foo, which is the original request sent to the external server.
  2. GET /foo/, which is the request sent to the internal server. Notice the rewriting: a slash has been added at the end.
  3. Location /foo, which is the Location in the internal server's redirect response. This is the intended location, because on the internal server GET /foo/ is redirected to Location /foo, while GET /foo is redirected to Location /foo/bar/baz. When the internal server is accessed directly, GET /foo/ is redirected to GET /foo, which in turn yields Location /foo/bar/baz, but this does not happen behind the reverse proxy.
  4. Location /foo, which is the Location in the external server's redirect response. Because (4) and (1) are the same URL, this creates a redirect loop.
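
The four steps above can be reproduced with a small model. This is only a sketch in plain JavaScript, not Apache or Next.js code: `internalRedirect`, `proxy` and `follow` are hypothetical names, and the redirect rules are simply copied from the observations in step 3.

```javascript
// Hypothetical model of the internal server's redirects, as observed in step 3.
function internalRedirect(path) {
  if (path === '/foo/') return '/foo';        // trailing slash is stripped
  if (path === '/foo') return '/foo/bar/baz'; // redirect from next.config.js
  return null;                                // anything else is served directly
}

// Model of the reverse proxy. `addSlash` mimics the load-balanced setup,
// where GET /foo reaches the internal server as GET /foo/ (step 2).
function proxy(path, addSlash) {
  const forwarded = addSlash && path === '/foo' ? '/foo/' : path;
  return internalRedirect(forwarded);
}

// Follow redirects the way a browser would, giving up after `maxHops`.
function follow(start, addSlash, maxHops = 10) {
  const visited = [start];
  let current = start;
  for (let hop = 0; hop < maxHops; hop++) {
    const next = proxy(current, addSlash);
    if (next === null) return { loop: false, visited };
    visited.push(next);
    current = next;
  }
  return { loop: true, visited };
}

console.log(follow('/foo', false)); // no added slash: ends at /foo/bar/baz
console.log(follow('/foo', true));  // added slash: keeps returning to /foo
```

With `addSlash` set to `false` the chain stops after one redirect, while with `true` every hop lands back on /foo until the hop limit is reached, which matches the browser's redirect error.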

With the rewriting confirmed, I am now looking for a way to solve this behavior.


Solution

  • I finally managed to fix the issue. Following up on the 2nd update above, I found where the "rewriting" happens, which helped me fix it.


    tl;dr
    The fix is basically moving the path from BalancerMember to the balancer itself, i.e.

    # The balancer name now carries the path (used to be "balancer://example")
    <Proxy "balancer://example/foo">
        # The members no longer carry the path
        # (used to be "http://localhost:3000/foo" and "http://localhost:3001/foo")
        BalancerMember "http://localhost:3000"
        BalancerMember "http://localhost:3001"
    </Proxy>

    # Point ProxyPass and ProxyPassReverse at the balancer accordingly
    <Location "/foo">
        ProxyPass "balancer://example/foo"
        ProxyPassReverse "balancer://example/foo"
    </Location>
    

    As for my finding, the "rewriting" is not so much rewriting as URL canonicalisation performed by Apache, in the mod_proxy_balancer or mod_proxy_http module (depending on your scheme). An example is the canonicalisation source code in mod_proxy_balancer.

    Changing balancer://example to balancer://example/foo gives it the same URL structure as http://localhost:3000/foo, so canonicalisation no longer yields a trailing slash. That trailing slash was caused by balancer://example having no path right after the "host". With this change, the reverse proxy finally behaves the same with or without load balancing.
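
To illustrate the explanation above, here is a simplified JavaScript model of how the backend URL ends up with or without the trailing slash. It is not Apache's code: `backendUrl` is a hypothetical helper, and it assumes that canonicalisation always places a "/" between the balancer "host" and whatever path follows, and that the balancer prefix is then swapped for the chosen member.

```javascript
// Simplified, hypothetical model of the proxy URL handling described above.
// Assumption: canonicalisation rebuilds the URL as host + "/" + path, so a
// balancer URL without a path gains a trailing slash.
function backendUrl(balancerUrl, member, locationPrefix, requestPath) {
  // ProxyPass inside <Location>: the part of the request path after the
  // location prefix is appended to the ProxyPass target.
  const mapped = balancerUrl + requestPath.slice(locationPrefix.length);

  // Split "balancer://host/path" into host and path (path may be empty).
  const rest = mapped.slice('balancer://'.length);
  const slash = rest.indexOf('/');
  const host = slash === -1 ? rest : rest.slice(0, slash);
  const path = slash === -1 ? '' : rest.slice(slash + 1);

  // The canonicalised form always has a "/" after the host.
  const canonical = 'balancer://' + host + '/' + path;

  // Replace the balancer prefix with the selected member.
  return member + canonical.slice(('balancer://' + host).length);
}

// Original config: path on the members.
console.log(backendUrl('balancer://example', 'http://localhost:3000/foo', '/foo', '/foo'));
// → http://localhost:3000/foo/

// Fixed config: path on the balancer.
console.log(backendUrl('balancer://example/foo', 'http://localhost:3000', '/foo', '/foo'));
// → http://localhost:3000/foo
```

In this model the only difference between the two configurations is whether the path sits before or after the point where the "/" is inserted, which is why moving /foo onto the balancer removes the extra slash.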