apache .htaccess mod-rewrite

mod-rewrite: Rewrite rule constantly adds .php to the end of some URLs


On my web server, I have a few rewrite rules in the .htaccess file located in the document root (in my case, /var/www/main/public_html/.htaccess). The purpose of these rewrite rules is to remove the need for .php and .html file extensions in the URL.

Here's the problem: It works perfectly for most URLs. However, when a URL treats something that is actually a file as if it were a directory (for example /index/file), the rewrite rule keeps adding .php to the end until the server gives up and throws a 500 error.

To better explain the issue, let's say my file structure looks like this:

/index.php
/file.php
/folder/index.php
/folder/file.php

These URLs display pages correctly:

example.com
example.com/index
example.com/file
example.com/folder
example.com/folder/index
example.com/folder/file

This strange looping happens, however, on URLs like these:

example.com/index/file
example.com/file/index

In the case of /index/file, the logs show:

[core:debug] [pid 253721] core.c(3849): [client #####] AH00121: r->uri = /index/file.php.php.php.php.php.php.php.php.php.php

The expected result is that a URL such as example.com/index/file would result in a 404 error.

And everything else displays a 404 as expected:

example.com/badfile
example.com/folder/badfile
example.com/badfolder/file

I'm not super experienced with rewrite rules and don't really know what I'm doing :D I'm almost certain I've made a stupid mistake somewhere. Any help would be appreciated!

The server is self-hosted running on Ubuntu 20.04 with Apache2.

This is what my .htaccess file contains:

Options +FollowSymLinks -MultiViews
RewriteEngine On
RewriteBase /

# External Redirect
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s([^.]+)\.html [NC]
RewriteRule ^ %1 [R=302,L,NE]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s([^.]+)\.php [NC]
RewriteRule ^ %1 [R=302,L,NE]

# Internal Redirect
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^ %{REQUEST_URI}.html [L]
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^ %{REQUEST_URI}.php [L]

# Error Documents
ErrorDocument 403 /errors/403.php
ErrorDocument 404 /errors/404.php
ErrorDocument 500 /errors/500.php

Solution

  • # Internal Redirect
    RewriteCond %{REQUEST_FILENAME}.html -f
    RewriteRule ^ %{REQUEST_URI}.html [L]
    RewriteCond %{REQUEST_FILENAME}.php -f
    RewriteRule ^ %{REQUEST_URI}.php [L]
    

    The problem is your "internal redirect/rewrite". The condition (e.g. %{REQUEST_FILENAME}.html) is not necessarily testing the same thing that you ultimately rewrite to in the substitution string (e.g. %{REQUEST_URI}.html), which results in an endless rewrite loop in certain scenarios. (Yes, using the END flag, as mentioned in comments, would prevent the rewrite loop. However, this does not address the underlying issue and would still result in an incorrect rewrite and the "wrong URL" being logged as a 404.)
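
    For reference, the END variant mentioned in the comments would presumably look something like the sketch below. This is an assumption about what was suggested, not a recommendation: it only swaps L for END (available in Apache 2.4), so the loop stops, but the mismatch between condition and substitution remains.

    ```apache
    # Internal Redirect (sketch only: loop suppressed, target still wrong)
    # END stops all further rewrite processing for this request, so the
    # ".php.php.php..." loop cannot occur. The condition and the substitution
    # still refer to different paths, so /index/file is rewritten once to
    # /index/file.php, which does not exist.
    RewriteCond %{REQUEST_FILENAME}.html -f
    RewriteRule ^ %{REQUEST_URI}.html [END]
    RewriteCond %{REQUEST_FILENAME}.php -f
    RewriteRule ^ %{REQUEST_URI}.php [END]
    ```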

    When requesting a URL of the form example.com/index/file (where index also maps to a .php or .html file and is not a physical directory), the REQUEST_FILENAME server variable is of the form /path/to/index (note the missing /file), but REQUEST_URI is /index/file (the requested URL). So the condition (RewriteCond directive) is successful, since /path/to/index.php exists, but the request is incorrectly rewritten to /index/file.php, which does not exist. The rewrite engine then starts over and the same issue occurs, appending another .php on each pass. (The END flag causes the rewrite engine to stop after the first "incorrect" rewrite.)
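
    The sequence described above can be traced pass by pass. The annotated sketch below is only an illustration: it reuses the /path/to placeholder from the explanation and assumes that index.php exists while index.html does not, so only the .php rule is shown.

    ```apache
    # Request: example.com/index/file  (index.php exists; "index" is not a directory)
    #
    # Pass 1
    #   REQUEST_URI      = /index/file
    #   REQUEST_FILENAME = /path/to/index          (no trailing /file)
    #   RewriteCond tests /path/to/index.php -f    -> true
    #   RewriteRule rewrites to /index/file.php    (no such file)
    #
    # Pass 2 (the rewrite engine starts over)
    #   REQUEST_URI      = /index/file.php
    #   REQUEST_FILENAME = /path/to/index          (unchanged)
    #   RewriteCond tests /path/to/index.php -f    -> true again
    #   RewriteRule rewrites to /index/file.php.php
    #
    # ...and so on, until the server gives up and responds with a 500.
    RewriteCond %{REQUEST_FILENAME}.php -f
    RewriteRule ^ %{REQUEST_URI}.php [L]
    ```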

    These rules should be written like this instead:

    # Internal Redirect
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.html -f
    RewriteRule ^ %{REQUEST_URI}.html [L]
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.php -f
    RewriteRule ^ %{REQUEST_URI}.php [L]
    

    The preceding condition is now testing the same file-path that we are ultimately rewriting to in the substitution string.
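
    Putting it together, the complete .htaccess might then look like the sketch below. It is simply the file from the question with the two internal-rewrite conditions swapped for the corrected ones, and it assumes (as in the question) that the .htaccess file sits in the document root.

    ```apache
    Options +FollowSymLinks -MultiViews
    RewriteEngine On
    RewriteBase /

    # External Redirect: strip .html/.php from directly requested URLs
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s([^.]+)\.html [NC]
    RewriteRule ^ %1 [R=302,L,NE]
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s([^.]+)\.php [NC]
    RewriteRule ^ %1 [R=302,L,NE]

    # Internal Redirect: the condition tests the same path the rule rewrites to
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.html -f
    RewriteRule ^ %{REQUEST_URI}.html [L]
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}.php -f
    RewriteRule ^ %{REQUEST_URI}.php [L]

    # Error Documents
    ErrorDocument 403 /errors/403.php
    ErrorDocument 404 /errors/404.php
    ErrorDocument 500 /errors/500.php
    ```

    With these rules, a request for /index/file tests for /index/file.html and /index/file.php under the document root. Neither exists, so no rewrite occurs and the request should fall through to the 404 error document.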

    For further reading/explanation/examples, see my answer to the following question on ServerFault (StackOverflow's sister site on server management):