Apache Reverse Proxy Sending Browser to Backend Directly Instead


(See the UPDATE at the bottom for the main question; the details below may be superfluous.)

I'm having an interesting problem with Apache not reverse proxying as expected.

Basically, when I click a link on my website that points to the relative path /app1, I expect the URL to be external.company.ca/app1, with the content coming from internal.company.com/some_app. Instead, the browser goes directly to internal.company.com/some_app.

No 302 or anything, just straight there. This is odd to me, since internal.company.com is not mentioned anywhere in the configuration except in the reverse proxy directives, so I don't know how the browser is learning of the domain at all.

Here is a Fiddler capture from the client (browser) point of view showing the behaviour right after I click the link that goes to /app1 (you'll have to trust me that the green names are external.company.ca and the black names are internal.company.com and the path is /some_app/blahblah):

[Image: Fiddler capture]

Everything after this point loads the page from internal.company.com. This won't work at all in production, of course.

The following is a (truncated) version of our Apache configuration files for consideration:

<VirtualHost *:80>
    # rewrite rules to 443
</VirtualHost>

<VirtualHost *:443>
    ServerName external.company.ca
    ServerAlias external.company.com

    # Logging rules.........

    SSLEngine on
    SSLProxyEngine on
    SSLProxyVerify none

    # Most of this is off for testing purposes, adding in case it matters

    SSLProxyCheckPeerCN off
    SSLProxyCheckPeerName off
    SSLProxyCheckPeerExpire off

    # more SSL stuff.... Now on to the interesting part

    ProxyPreserveHost On
    ProxyPass /app1 https://internal.company.com/some_app
    ProxyPassReverse /app1 https://internal.company.com/some_app
</VirtualHost>

At one point, I thought that possibly the cookies were throwing things off since they were under different domains (.ca in front, .com in back), but I believe if the reverse proxying was working correctly, the browser would be none the wiser. Anyone see anything wrong with the above?

UPDATE

I found the culprit:

<script type="text/javascript">window.location.assign('https://internal.company.com/app1/login?redirectUrl=' + encodeURIComponent(window.location.pathname + window.location.hash));</script>

The problem is, how do I rewrite this absolute URL using Apache? I know mod_proxy_html modifies element attributes (such as href in the a element) but can it rewrite arbitrary data in an element itself?

The internal application was provided by a vendor, and although it may be possible to make modifications to it to remove code like the above, I would prefer to stay away from that path for now to see if there are alternatives.
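
For reference, the proxy directives in the question could not have caught this URL: as far as I know, ProxyPassReverse only adjusts the Location, Content-Location and URI headers of backend responses and never looks at the response body. The annotated copy below restates the original three directives with comments describing that behaviour; nothing in it is new configuration.

```apache
# Pass the incoming request's Host header through to the backend.
ProxyPreserveHost On

# Map requests for /app1 on the proxy to /some_app on the backend.
ProxyPass /app1 https://internal.company.com/some_app

# Rewrite backend URLs in the Location, Content-Location and URI response
# headers back to /app1. Response bodies (HTML, inline JavaScript) are not
# touched, so the window.location.assign(...) call reaches the browser as is.
ProxyPassReverse /app1 https://internal.company.com/some_app
```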


Solution

  • I've come up with a somewhat nasty work-around:

    ProxyHTMLEnable On
    ProxyHTMLExtended On
    ProxyHTMLLinks script src
    ProxyHTMLURLMap https://internal.company.com
    

    The problem is the use of absolute URLs throughout the HTML (and JavaScript) coming from the vendor's app. A search and removal of the domain solves the problem (but is incredibly slow).

    If anyone has this problem in the future, I do not recommend using this solution. I'm guessing you're here because you can't modify the internal application. You should instead be sending in a ticket to whoever maintains the code to make their application more reverse-proxy friendly.
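
    If the work-around has to stay for a while, one way to limit the slowdown might be to apply the filter only to the proxied path rather than to the whole virtual host. The sketch below is an assumption, not a tested configuration: it moves the ProxyPass lines from the question and the ProxyHTML lines above into a <Location /app1> block, and adds a RequestHeader line (which needs mod_headers) on the assumption that the backend might otherwise send compressed responses that mod_proxy_html cannot parse.

    ```apache
    <Location /app1>
        # Inside <Location>, ProxyPass and ProxyPassReverse take the path from
        # the enclosing block, so only the backend URL is given.
        ProxyPass https://internal.company.com/some_app
        ProxyPassReverse https://internal.company.com/some_app

        # Assumption: ask the backend for uncompressed responses so the HTML
        # filter can read them. Requires mod_headers.
        RequestHeader unset Accept-Encoding

        # The work-around from the answer, limited to responses under /app1.
        ProxyHTMLEnable On
        ProxyHTMLExtended On
        ProxyHTMLLinks script src
        ProxyHTMLURLMap https://internal.company.com
    </Location>
    ```

    This block would take the place of the two top-level ProxyPass /app1 and ProxyPassReverse /app1 lines inside the :443 virtual host; ProxyPreserveHost On can stay where it is.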