Tags: apache, .htaccess, http-redirect, mod-rewrite

.htaccess no .php and no trailing slash


I have a simple website made out of .php files, no database. I used .htaccess to enforce the following rules:

  • force https
  • force www at beginning
  • remove .php extension (https://www.example.com/page.php -> https://www.example.com/page)
  • no trailing slash

This last rule, no trailing slash, does not work. Instead it leads to an error 404. That is the problem I am trying to solve. If someone opens https://www.example.com/page/ I want it to redirect to https://www.example.com/page and not give a 404.

Here are the relevant .htaccess lines I'm currently using. They are based on the HTML5 Boilerplate .htaccess with added copy-paste snippets, because I have no .htaccess knowledge.

ErrorDocument 404 /errors/404.php

Options -MultiViews

<IfModule mod_rewrite.c>

    # (1)
    RewriteEngine On

    # (2)
    Options +FollowSymlinks

</IfModule>

# to https

<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteCond %{HTTPS} !=on
  RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R] # to 301
</IfModule>

# force www at beginning

<IfModule mod_rewrite.c>

  RewriteEngine On

#     # (1)
  RewriteCond %{HTTPS} =on
  RewriteRule ^ - [E=PROTO:https]
  RewriteCond %{HTTPS} !=on
  RewriteRule ^ - [E=PROTO:http]

#     # (2)
#     # RewriteCond %{HTTPS} !=on

  RewriteCond %{HTTP_HOST} !^www\. [NC]
  RewriteCond %{SERVER_ADDR} !=127.0.0.1
  RewriteCond %{SERVER_ADDR} !=::1
  RewriteRule ^ %{ENV:PROTO}://www.%{HTTP_HOST}%{REQUEST_URI} [L,R] # <- for test, for prod use [L,R=301]

</IfModule>

# remove .php

<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule ^([^\.]+)$ $1.php [NC,L]
</IfModule>

# remove trailing /

<IfModule mod_rewrite.c>
  RewriteEngine On
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteCond %{REQUEST_URI} (.*)/$
  RewriteRule ^(.*)/$ $1 [L,R] # <- for test, for prod use [L,R=301]
</IfModule>

Solution

  • # remove .php
    
    <IfModule mod_rewrite.c>
      RewriteEngine On
      RewriteCond %{REQUEST_FILENAME} !-f
      RewriteRule ^([^\.]+)$ $1.php [NC,L]
    </IfModule>
    
    # remove trailing /
    
    <IfModule mod_rewrite.c>
      RewriteEngine On
      RewriteCond %{REQUEST_FILENAME} !-f
      RewriteCond %{REQUEST_URI} (.*)/$
      RewriteRule ^(.*)/$ $1 [L,R] # <- for test, for prod use [L,R=301]
    </IfModule>
    

    These rules are in the wrong order. A request for /page/ is first rewritten to /page/.php (which naturally results in a 404) before the trailing slash can be removed. By that point the URL no longer has a trailing slash, since it ends in .php, so the second rule never matches.

    In addition, your rule to remove the trailing slash should check that the request is not a directory, not that it is not a file. The second condition, which checks against REQUEST_URI, is superfluous because the RewriteRule pattern already tests for the trailing slash. You are also missing a slash prefix on the substitution string (and there is no RewriteBase defined), so this would have resulted in a malformed redirect.
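
    To illustrate those three points, here is a sketch of the original rule (commented out) next to a corrected one. The filesystem path in the comment is hypothetical; the exact malformed URL depends on where the site lives on your server. The redirect is left as a temporary one for testing.

    ```apache
    # Original: relative substitution and no RewriteBase. In .htaccess,
    # mod_rewrite prefixes a relative substitution with the directory's
    # filesystem path, so /page/ could be redirected to something like
    # /var/www/html/page (hypothetical path) instead of /page.
    #RewriteCond %{REQUEST_FILENAME} !-f
    #RewriteCond %{REQUEST_URI} (.*)/$
    #RewriteRule ^(.*)/$ $1 [L,R]

    # Corrected: skip real directories (mod_dir appends the slash to
    # those by default, so removing it would loop), drop the redundant
    # condition, and make the substitution root-relative.
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.+)/$ /$1 [R,L]
    ```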

    Beyond that, your rules can be greatly simplified. There is no need for the <IfModule> wrappers or for multiple RewriteEngine On directives. The non-www to www redirect also unnecessarily preserves the scheme (HTTP or HTTPS) when, after the first rule, it is always HTTPS.

    Your rules could be written more succinctly like this:

    ErrorDocument 404 /errors/404.php
    
    Options +FollowSymLinks -MultiViews
    
    RewriteEngine On
    
    # to https
    RewriteCond %{HTTPS} !=on
    RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
    
    # force www at beginning
    RewriteCond %{HTTP_HOST} !^www\. [NC]
    RewriteCond %{SERVER_ADDR} !=127.0.0.1
    RewriteCond %{SERVER_ADDR} !=::1
    RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
    
    # remove trailing /
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.+)/$ /$1 [R=301,L]
    
    # remove .php (Actually, this "appends" .php, it doesn't "remove" anything)
    RewriteCond %{DOCUMENT_ROOT}/$1.php -f
    RewriteRule ^([^.]+)$ $1.php [L]
    

    I would question whether the two conditions that check against SERVER_ADDR are really necessary here, since you don't have something similar on the HTTP to HTTPS rule.

    There is no need to backslash-escape literal dots inside a regex character class, so [^\.] can be written as [^.].

    The rule that you have labelled "remove .php", doesn't actually "remove" anything. It appends the .php extension on requests where the .php extension has already been removed. It is better to first test that the corresponding .php file exists before attempting to rewrite the request, rather than unconditionally appending .php in the hope that the file exists (this can result in unexpected errors in some scenarios and at the very least logs the 404 on the .php request, rather than the URL that was actually requested).

    This rule block (as per your original rule block) will also result in two redirects if requesting HTTP + non-www since you are redirecting HTTP to HTTPS on the same host first. This is actually a requirement if implementing HSTS, but otherwise you can avoid this double redirect by reversing the first two rules. (However, the redirect that removes the trailing slash would also result in an additional redirect as written. This can be changed if so desired, but otherwise does not cause an immediate issue.)
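
    A minimal sketch of that reordering, assuming HSTS is not in use. It keeps the SERVER_ADDR conditions from the block above; only the order of the two rules changes.

    ```apache
    # force www first, going straight to HTTPS
    # (HTTP + non-www now reaches https://www. in a single redirect)
    RewriteCond %{HTTP_HOST} !^www\. [NC]
    RewriteCond %{SERVER_ADDR} !=127.0.0.1
    RewriteCond %{SERVER_ADDR} !=::1
    RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

    # then to https, for any request not already redirected above
    RewriteCond %{HTTPS} !=on
    RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
    ```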

    NB: Be careful with line-end comments (I've removed them). They are not supported by Apache. They might appear to work just because of the way config directives are parsed, but if you have omitted any optional arguments then you'll get a 500 Internal Server Error due to invalid syntax. (But yes, always test first with 302 - temporary - redirects.)
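
    As a sketch of the safer style, move each note onto its own line above the directive. The first directive below is your original line, commented out for comparison.

    ```apache
    # Risky: the trailing "# to 301" is not a real comment. It only
    # appears to work here because the flags argument is present.
    #RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R] # to 301

    # Safer: the comment sits on its own line.
    # Temporary (302) redirect while testing; use R=301 for production.
    RewriteCond %{HTTPS} !=on
    RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=302,L]
    ```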


    UPDATE:

    > everything works as it should except this: example.com/nonexistingfile.php. This causes a 404 but weirdly not my custom 404 (ErrorDocument 404 /errors/404.php) but a 'server backup 404'. Just a plain text "File not found." Any ideas on that?

    This is likely related to the way PHP is installed on your server (nothing related to the above directives). It's possible that your server is ultimately proxying all .php requests to the backend PHP engine, essentially bypassing your .htaccess/ErrorDocument directive.

    As a workaround you could try forcing a 404 using mod_rewrite for any request that contains a .php extension (since - I assume - no client should be making a direct request to a .php file).

    For example, try adding the following immediately after the RewriteEngine On directive (or after the "to https" rule to ensure the 404 response is always over HTTPS):

    # Force any "direct" request to a ".php" file to return a 404
    RewriteCond %{ENV:REDIRECT_STATUS} ^$
    RewriteRule \.php$ - [R=404]
    

    The check against the REDIRECT_STATUS env var ensures this rule only applies to direct requests from the client and not requests that have been internally rewritten on the server (by the last rule).
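
    With the second placement mentioned above (after the "to https" rule), the start of the rule set might look like this sketch. The remaining rules are unchanged and omitted here.

    ```apache
    RewriteEngine On

    # to https
    RewriteCond %{HTTPS} !=on
    RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

    # Force any "direct" request to a ".php" file to return a 404
    # (REDIRECT_STATUS is empty only on the initial client request)
    RewriteCond %{ENV:REDIRECT_STATUS} ^$
    RewriteRule \.php$ - [R=404]

    # force www, remove trailing slash and append .php follow here
    ```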