
Htaccess: rewrite .html, .htm and .php to each other to help transition pages to the same file extension


I've currently got the following htaccess rules, which swap the .html and .htm file extensions back and forth: if you try to load index.html but the only file that exists is index.htm, it will serve that instead, and vice versa.

The goal is to move everything to PHP, but in the meantime, is it possible to extend this to cover .php as well? If one of the older HTML pages links to index.htm or index.html, it would find they don't exist and serve index.php instead. Likewise, if you request index.php and it doesn't exist, it would serve either the .htm or the .html file.

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{DOCUMENT_ROOT}/$1\.html -f [NC]
RewriteRule ^(.+?)(?:\.(?:htm))?$ /$1.html [L,NC,R=302]

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{DOCUMENT_ROOT}/$1\.htm -f [NC]
RewriteRule ^(.+?)(?:\.(?:html))?$ /$1.htm [L,NC,R=302]

Similar to .htaccess: rewrite .htm urls internally to .php, but also redirect .php urls to .htm but a little more complicated.


Solution

  • RewriteCond %{DOCUMENT_ROOT}/$1\.html -f [NC]
    

    The NC flag is not supported when using -f. Whilst this isn't an "error" (the flag is simply ignored), your error log is likely to be littered with warnings.

    There is also no need to backslash-escape the literal dot in the TestString (1st argument to the RewriteCond directive). This is evaluated as an ordinary string, not a regex.

    RewriteRule ^(.+?)(?:\.(?:htm))?$ /$1.html [L,NC,R=302]
    

    Since you've made the file extension optional in the RewriteRule pattern, the regex matches every request, so you end up testing every URL, not just those that end in .htm (in this example). For example, request /foo.htm and the above tests whether /foo.html exists (good), but request /foo.php and it tests whether /foo.php.html exists (unnecessary).

    You should be checking for a specific extension in each rule.
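
    A minimal sketch of your first rule with these points applied: no NC flag on the condition, an unescaped dot in the TestString and a mandatory .htm extension in the pattern. It assumes the .htaccess file is in the document root and that the rewrite engine is already enabled:

    ```apache
    # Request .htm, test .html (only requests that end in .htm reach the condition)
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{DOCUMENT_ROOT}/$1.html -f
    RewriteRule ^(.+)\.htm$ /$1.html [NC,R=302,L]
    ```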

    You want to check every file extension, without prioritising any. It would be preferable (simpler, more efficient and arguably better for SEO) to omit the file extension from the request and to prioritise the file extensions you want to serve. For example, request /foo and serve .php if it exists, otherwise .html, otherwise .htm. Anyway, that's not what you are asking here.
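
    For illustration only, a rough sketch of that extensionless approach. It assumes the .htaccess file is in the document root, that MultiViews is disabled, and that extensionless URLs contain no dots. Unlike the redirects used elsewhere in this answer, these are internal rewrites, so the visible URL stays as /foo:

    ```apache
    RewriteEngine On

    # Leave requests that already map to a file or directory alone
    RewriteCond %{REQUEST_FILENAME} -f [OR]
    RewriteCond %{REQUEST_FILENAME} -d
    RewriteRule ^ - [L]

    # Request /foo, serve foo.php if it exists
    RewriteCond %{DOCUMENT_ROOT}/$1.php -f
    RewriteRule ^([^.]+)$ $1.php [L]

    # ...otherwise foo.html
    RewriteCond %{DOCUMENT_ROOT}/$1.html -f
    RewriteRule ^([^.]+)$ $1.html [L]

    # ...otherwise foo.htm
    RewriteCond %{DOCUMENT_ROOT}/$1.htm -f
    RewriteRule ^([^.]+)$ $1.htm [L]
    ```

    The order of the three rules is what sets the priority: the first extension whose file exists wins.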

    The solution is similar to what you have already done; you just need to be methodical and test each combination. You can also use an optimisation and skip all the checks if the request already maps to an existing file.

    Try the following:

    # If the request already maps to a file then skip the following "5" rules
    RewriteCond %{REQUEST_FILENAME} -f
    RewriteRule ^ - [S=5]
    
    # ----------
    # Request .php, test .html
    RewriteCond %{DOCUMENT_ROOT}/$1.html -f
    RewriteRule ^(.+)\.php$ /$1.html [NC,R=302,L]
    
    # Request .php, test .htm
    RewriteCond %{DOCUMENT_ROOT}/$1.htm -f
    RewriteRule ^(.+)\.php$ /$1.htm [NC,R=302,L]
    
    # ----------
    # Request .html (or .htm), test .php
    RewriteCond %{DOCUMENT_ROOT}/$1.php -f
    RewriteRule ^(.+)\.html?$ /$1.php [NC,R=302,L]
    
    # Request .html, test .htm
    RewriteCond %{DOCUMENT_ROOT}/$1.htm -f
    RewriteRule ^(.+)\.html$ /$1.htm [NC,R=302,L]
    
    # ----------
    # Request .htm, test .html
    RewriteCond %{DOCUMENT_ROOT}/$1.html -f
    RewriteRule ^(.+)\.htm$ /$1.html [NC,R=302,L]
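
    The S=5 skip count has to be kept in sync if rules are added or removed. If no later directives in the same file need to run for requests that already map to a file (an assumption about your setup), one possible variation is to stop processing for existing files instead of counting rules:

    ```apache
    # Alternative to [S=5]: stop this pass of rule processing when the request maps to an existing file
    RewriteCond %{REQUEST_FILENAME} -f
    RewriteRule ^ - [L]
    ```

    With this variation, any rules placed further down the file are also skipped for existing files, which may or may not be what you want.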