Serve index.php if index.html is not present in directory without a trailing slash


I have an HTML website with subdirectories.

What I need is:

  1. Access subdirectories with no trailing slash, i.e. example.com/blog, not example.com/blog/.
  2. If index.html is absent from a subdirectory, index.php should be served instead.

/blog/ page is a WordPress page with its own .htaccess.

Here's my current root .htaccess:

RewriteEngine on
ErrorDocument 404 /error/error404.html


DirectorySlash Off
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI}/index.html -f
RewriteRule ^(.*)$ /$1/index.html [L]

WordPress .htaccess at /blog/.htaccess:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
RewriteBase /blog/
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /blog/index.php [L]
</IfModule>

(This is autogenerated by WordPress.)

As it stands, subdirectories that contain index.html are served correctly, but those that only contain index.php are not. What should I do?


Solution

  • The DirectoryIndex directive lists the files to try, in order, when a directory is requested. You don't need mod_rewrite for this.
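
    As a minimal sketch, assuming mod_dir's default DirectorySlash On is left in place (so directories are reached with the trailing slash), the index.html to index.php fallback on its own needs only one directive:

    ```apache
    # Assumes the default "DirectorySlash On" (directory URLs end in a slash).
    # Try index.html first, then fall back to index.php in the same directory.
    DirectoryIndex index.html index.php
    ```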

    However, when DirectorySlash Off is set, you need to manually append the trailing slash to the directory with an internal rewrite for DirectoryIndex to take effect. Otherwise, a 403 is triggered even when the directory contains a DirectoryIndex document. (For the same reason, mod_autoindex needs to be disabled to prevent accidental disclosure of information: with DirectorySlash Off, the presence of a DirectoryIndex document is not sufficient to prevent auto-generated directory listings.)

    For example:

    # Disable directory listings (mod_autoindex)
    Options -Indexes
    
    ErrorDocument 404 /error/error404.html
    
    # Look for index.html then index.php in the directory being requested
    DirectoryIndex index.html index.php
    
    # Prevent mod_dir appending the trailing slash
    DirectorySlash Off
    
    RewriteEngine On
    
    # If the request is for a directory without a trailing slash, rewrite internally to append it
    # - This allows DirectoryIndex to work as intended
    # - The document root is excluded, since (.+) requires a non-empty URL-path
    RewriteCond $1 !/$
    RewriteCond %{REQUEST_FILENAME} -d
    RewriteRule (.+) $1/ [L]
    

    Make sure the browser cache is cleared before testing. By default (DirectorySlash On), mod_dir issues a 301 (permanent) redirect to append the trailing slash, and browsers may have cached that redirect.


    UPDATE#1:

    /blog/ page is a WordPress page with its own .htaccess.

    The problem is that when the rewrite engine is enabled in the subdirectory's .htaccess, the mod_rewrite directives there completely override those in the parent config. As a result, the slash is not appended and the directory index is not triggered, and the WordPress mod_rewrite directives are not processed either because of the missing slash. (This feels a bit chicken and egg.)

    The only way I have successfully resolved this is to move the WordPress directives into the parent config (making the necessary changes) and remove /blog/.htaccess altogether.

    For example, append the following modified WordPress directives after the directives above:

    # WordPress in "/blog" subdirectory
    RewriteRule ^ - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
    RewriteRule ^blog/index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^blog/. /blog/index.php [L]
    

    UPDATE#2:

    have URL with trailing slash to redirect to URL without trailing slash? like for example, if I visit /about/ it should redirect to /about?

    Yes, we can do this. Since we are removing the trailing slash from everything, including directories, this is essentially unconditional. The only caveat is that the rule that removes the trailing slash must apply only to direct requests from the client, not to requests already rewritten by the later rule that appends the trailing slash to directories. Otherwise, the result would be a redirect loop.

    Aside: You obviously need to ensure that all your internal links do not include the trailing slash, otherwise, the user will experience an external redirect (bad for SEO and your server).

    We can do this by checking against the REDIRECT_STATUS environment variable, which is empty on the initial request, and set to 200 (as in 200 OK status) after the first successful rewrite (that appends the trailing slash to directories).

    For example, the following should go immediately after the RewriteEngine directive, before the existing rewrite:

    # Redirect direct requests to remove trailing slash from all URLs
    RewriteCond %{ENV:REDIRECT_STATUS} ^$
    RewriteRule (.+)/$ /$1 [R=302,L]
    

    Test first with a 302 (temporary) redirect and change to a 301 (permanent) redirect once you have confirmed it works as intended. This is to avoid potential caching issues.

    Summary

    With all the directives in place:

    # Disable directory listings (mod_autoindex)
    Options -Indexes
    
    ErrorDocument 404 /error/error404.html
    
    # Look for index.html then index.php in the directory being requested
    DirectoryIndex index.html index.php
    
    # Prevent mod_dir appending the trailing slash
    DirectorySlash Off
    
    RewriteEngine On
    
    # Redirect direct requests to remove trailing slash from all URLs
    RewriteCond %{ENV:REDIRECT_STATUS} ^$
    RewriteRule (.+)/$ /$1 [R=301,L]
    
    # If the request is for a directory without a trailing slash, rewrite internally to append it
    # - This allows DirectoryIndex to work as intended
    # - The document root is excluded, since (.+) requires a non-empty URL-path
    RewriteCond $1 !/$
    RewriteCond %{REQUEST_FILENAME} -d
    RewriteRule (.+) $1/ [L]
    
    # WordPress in "/blog" subdirectory
    RewriteRule ^ - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
    RewriteRule ^blog/index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^blog/. /blog/index.php [L]