On my web server, I have a few rewrite rules in the .htaccess file located in the document root (in my case, /var/www/main/public_html/.htaccess). The purpose of these rewrite rules is to remove the need for .php and .html file extensions in the URL.
Here's the problem: it works perfectly for most URLs. However, when a URL treats a file as if it were a directory (a path segment that is actually a file, followed by further segments), the rewrite rule keeps appending .php until the server gives up and returns a 500 error.
To better explain the issue, let's say my file structure looks like this:
/index.php
/file.php
/folder/index.php
/folder/file.php
These URLs display pages correctly:
example.com
example.com/index
example.com/file
example.com/folder
example.com/folder/index
example.com/folder/file
However, this strange looping happens on URLs like these:
example.com/index/file
example.com/file/index
In the case of /index/file, the logs show:
[core:debug] [pid 253721] core.c(3849): [client #####] AH00121: r->uri = /index/file.php.php.php.php.php.php.php.php.php.php
The expected result is that a URL such as example.com/index/file would return a 404 error.
And everything else displays a 404 as expected:
example.com/badfile
example.com/folder/badfile
example.com/badfolder/file
I'm not super experienced with rewrite rules and don't really know what I'm doing :D I'm almost certain I've made a stupid mistake somewhere. Any help would be appreciated!
The server is self-hosted running on Ubuntu 20.04 with Apache2.
This is what my .htaccess file contains:
Options +FollowSymLinks -MultiViews
RewriteEngine On
RewriteBase /
# External Redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s([^.]+)\.html [NC]
RewriteRule ^ %1 [R=302,L,NE]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s([^.]+)\.php [NC]
RewriteRule ^ %1 [R=302,L,NE]
# Internal Redirect
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^ %{REQUEST_URI}.html [L]
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^ %{REQUEST_URI}.php [L]
# Error Documents
ErrorDocument 403 /errors/403.php
ErrorDocument 404 /errors/404.php
ErrorDocument 500 /errors/500.php
# Internal Redirect
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^ %{REQUEST_URI}.html [L]
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^ %{REQUEST_URI}.php [L]
The problem is your "internal redirect/rewrite". The condition (e.g. %{REQUEST_FILENAME}.html) is not necessarily testing the same thing that you ultimately rewrite to in the substitution string (e.g. %{REQUEST_URI}.html), which results in an endless rewrite loop in certain scenarios. (Yes, using the END flag, as mentioned in comments, would prevent the rewrite loop. However, this does not address the underlying issue and would still result in an incorrect rewrite and the "wrong URL" being logged as a 404.)
When requesting a URL of the form example.com/index/file (where index also maps to a .php or .html file and is not a physical directory), the REQUEST_FILENAME server variable is of the form /path/to/index (note the missing /file), but REQUEST_URI is /index/file (the requested URL). So the condition (the RewriteCond directive) is successful, since /path/to/index.php exists, but the request is incorrectly rewritten to /index/file.php, which does not exist. The rewrite engine then starts over and the same issue occurs. (The END flag causes the rewrite engine to stop after the first "incorrect" rewrite.)
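To make the mismatch concrete, here is a hedged trace of the original .php rule for a request to /index/file, written as comments around the rule. It assumes the document root given in the question (/var/www/main/public_html) and that index.php exists there while no index directory does. The values are what the explanation above implies, not captured server output.

```apache
# Request: /index/file  (index.php exists; there is no "index" directory)
#
# Pass 1
#   REQUEST_URI      = /index/file
#   REQUEST_FILENAME = /var/www/main/public_html/index
#   Condition tests    /var/www/main/public_html/index.php  -> exists, so it passes
#   Rewritten to       /index/file.php                      -> no such file
#
# Pass 2 (the rewrite engine starts over with the rewritten URL)
#   REQUEST_URI      = /index/file.php
#   REQUEST_FILENAME = /var/www/main/public_html/index      (unchanged)
#   Condition passes again, so the URL becomes /index/file.php.php
#
# ...and so on, until Apache hits its internal redirect limit and returns a 500.
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^ %{REQUEST_URI}.php [L]
```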
These rules should be written like this instead:
# Internal Redirect
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.html -f
RewriteRule ^ %{REQUEST_URI}.html [L]
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.php -f
RewriteRule ^ %{REQUEST_URI}.php [L]
The preceding condition now tests the same file path that we ultimately rewrite to in the substitution string.
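Tracing the corrected .php rule in the same way suggests why the loop disappears. This is a sketch that assumes the document root and file layout from the question, not captured output.

```apache
# Request: /index/file
#   Condition tests /var/www/main/public_html/index/file.php -> not a file
#   No rewrite happens, so the request falls through to a 404.
#
# Request: /folder/file
#   Pass 1: condition tests /var/www/main/public_html/folder/file.php -> exists
#           Rewritten to /folder/file.php
#   Pass 2: condition tests /var/www/main/public_html/folder/file.php.php -> not a file
#           No further rewrite; /folder/file.php is served.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.php -f
RewriteRule ^ %{REQUEST_URI}.php [L]
```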
For further reading/explanation/examples, see my answer to the following question on ServerFault (StackOverflow's sister site on server management):