I have a WordPress website located at https://blog.example.com and another site hosted separately in Azure App Service (Windows) at https://www.example.com. Cloudflare sits in front of both of these sites.
I have set up a reverse proxy that points requests from https://www.example.com/blog to https://blog.example.com. This appears to be mostly working, in that the blog posts appear under the expected URL (i.e. https://www.example.com/blog/a-blog-post), but there are a few peculiarities that make me think something is not set up quite right:
blog.example.com
)https://blog.example.com
https://blog.example.com
https://blog.example.com
-> https://www.example.com/blog
, it goes into an infinite redirect loop.
From my reading, I believe all of these problems occur because, when the request is processed by the server hosting WordPress, the Host header is blog.example.com rather than www.example.com. There are several places (e.g. here) where WordPress uses the Host header to construct certain URLs, rather than the WordPress Website URL or Home URL (both set to https://www.example.com/blog). Microsoft recommends preserving the original host header to resolve these problems.
Application Request Routing (ARR) on IIS has a preserveHostHeader option that can presumably be used to retain the original host header. I've tried enabling this, but the proxy stops working entirely:
- https://www.example.com/blog (the root of the blog) shows me the https://www.example.com homepage
- https://www.example.com/blog/a-blog-post shows me a 404 (generated by the site at https://www.example.com)
Here is my existing setup:
applicationHost.xdt (to enable ARR on Azure App Service as it's disabled by default)
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <system.webServer>
    <proxy xdt:Transform="InsertIfMissing" enabled="true" preserveHostHeader="false" reverseRewriteHostInResponseHeaders="false"/>
    <rewrite xdt:Transform="InsertIfMissing">
      <allowedServerVariables xdt:Transform="InsertIfMissing">
        <add name="HTTP_X_ORIGINAL_HOST" xdt:Transform="InsertIfMissing" xdt:Locator="Match(name)"/>
        <add name="HTTP_X_UNPROXIED_URL" xdt:Transform="InsertIfMissing" xdt:Locator="Match(name)"/>
        <add name="HTTP_X_ORIGINAL_ACCEPT_ENCODING" xdt:Transform="InsertIfMissing" xdt:Locator="Match(name)"/>
        <add name="HTTP_ACCEPT_ENCODING" xdt:Transform="InsertIfMissing" xdt:Locator="Match(name)"/>
      </allowedServerVariables>
    </rewrite>
  </system.webServer>
</configuration>
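For reference, the variant with host-header preservation switched on would presumably differ from the file above only in the preserveHostHeader attribute. A minimal sketch, assuming every other setting stays the same (this is an illustration, not a confirmed copy of the configuration that was actually tested):

```xml
<configuration xmlns:xdt="http://schemas.microsoft.com/XML-Document-Transform">
  <system.webServer>
    <!-- Assumption: same transform as above, with only preserveHostHeader flipped to "true"
         so that ARR forwards the incoming Host header (www.example.com) to the upstream server. -->
    <proxy xdt:Transform="InsertIfMissing" enabled="true" preserveHostHeader="true" reverseRewriteHostInResponseHeaders="false"/>
  </system.webServer>
</configuration>
```

With this setting, the server hosting WordPress receives requests addressed to www.example.com rather than blog.example.com, so it must be configured to answer for that hostname.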
web.config (to rewrite requests from subdirectory -> subdomain)
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <location path="." inheritInChildApplications="false">
    <system.webServer>
      <handlers>
        <add name="aspNetCore" path="*" verb="*" modules="AspNetCoreModuleV2" resourceType="Unspecified" />
      </handlers>
      <aspNetCore processPath="dotnet" arguments=".\Example.dll" stdoutLogEnabled="false" stdoutLogFile=".\logs\stdout" hostingModel="inprocess" />
      <rewrite>
        <rules>
          <clear />
          <rule name="Blog Proxy" stopProcessing="false">
            <match url="^blog(?:$|/)(.*)" />
            <action type="Rewrite" url="https://blog.example.com/{R:1}" appendQueryString="true" logRewrittenUrl="false" />
            <serverVariables>
              <set name="HTTP_X_UNPROXIED_URL" value="https://blog.example.com/{R:1}" />
              <set name="HTTP_X_ORIGINAL_ACCEPT_ENCODING" value="{HTTP_ACCEPT_ENCODING}" />
              <set name="HTTP_X_ORIGINAL_HOST" value="{HTTP_HOST}" />
              <set name="HTTP_ACCEPT_ENCODING" value="" />
            </serverVariables>
          </rule>
        </rules>
      </rewrite>
    </system.webServer>
  </location>
</configuration>
This appears to be a pretty standard setup for reverse proxies, but alas, it doesn't work. Is this because I'm running behind Cloudflare? Does preserveHostHeader not work with Azure App Service? How can I set this reverse proxy up so that it handles my use case?
The issue turned out to be how the web hosts had set up their environment.
The way the server is configured, when a request comes in for a specific hostname, the vhost files are checked for that hostname. When a matching one is found, content is loaded from the DocRoot set in that vhost file.
In my situation, there was no vhost file configured to handle requests for www.example.com. The default catchall vhost file therefore forwarded the request to the catchall DocRoot on the server, and because the /blog path we appended didn't exist in the catchall DocRoot, it returned a 404. Adding www.example.com as a pointer domain on the account told the server which DocRoot to serve requests from for that hostname, which appeared to fix the 404 response I was dealing with.
I think the 'pointer domain' concept may be unique to this particular web host, so this question and answer may not be generally applicable. Hopefully it gives you somewhere to look if you're experiencing the same thing, though.