I'm new to this site. If someone could please help I will be forever grateful.
I already have this code in my .htaccess file:
RewriteEngine On
RewriteCond %{THE_REQUEST} \s/([^.]+)\.html [NC]
RewriteRule ^ /%1 [R=301,L]
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^([0-9a-zA-Z_-]+)$ $1.html [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^.*$ / [R=301,L]
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ https://%1/$1 [L,R=301]
What this does: if you type in "website.com/page.html" it changes the URL to "website.com/page", and it also redirects any URL that does not match a page to the homepage.
How can I make it so that any case variation of a URL points to the page file? For example, typing "website.com/pAgE" should redirect to "website.com/page" and not to the homepage. Also, if any of my existing code is a silly way of doing the other stuff, please tell me.
Thanks
I tried to add:
RewriteCond %{REQUEST_URI} \[A-Z\]
RewriteRule ^(.\*)$ /${tolower:$1} \[R=301,L\]
but it seems to break the site.
I tried to add:
RewriteCond %{REQUEST_URI} \[A-Z\]
RewriteRule ^(.\*)$ /${tolower:$1} \[R=301,L\]
I assume those backslashes are typos (an attempt at formatting?) in your question and not part of your actual code?!
The tolower RewriteMap is only available if you have already configured it in the server config, which I assume you have not done? However, on Apache 2.4+ there is a tolower function that you can use directly in .htaccess, so the RewriteMap is not required (as it would be on earlier versions of Apache).
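For reference, here is a minimal sketch of what the RewriteMap approach would need, assuming you have access to the main server config or a virtual host (RewriteMap is not permitted in .htaccess). The map name lc is arbitrary and chosen only for illustration; int:tolower is Apache's built-in internal function.

```apache
# In the server config or <VirtualHost> container (NOT in .htaccess).
# "lc" is a hypothetical map name; int:tolower is the built-in function.
RewriteMap lc int:tolower

# Then, in .htaccess:
RewriteEngine On
RewriteCond %{REQUEST_URI} [A-Z]
RewriteRule ^(.*)$ /${lc:$1} [R=301,L]
```

Note that ${tolower:$1}, as used in your attempt, would only work if the map itself had been declared with the name tolower.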
RewriteCond %{THE_REQUEST} \s/([^.]+)\.html [NC]
RewriteRule ^ /%1 [R=301,L]
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^([0-9a-zA-Z_-]+)$ $1.html [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^.*$ / [R=301,L]
RewriteCond %{HTTP_HOST} ^www\.(.*)$ [NC]
RewriteRule ^(.*)$ https://%1/$1 [L,R=301]
However, there are other issues with your existing code.
The rules are in the wrong order. External redirects should come before internal rewrites. Otherwise you can end up redirecting to (and so exposing) the internally rewritten URL.
These rules will result in a redirect loop under certain conditions. REQUEST_FILENAME is not necessarily the same as the URL you are rewriting to, so you are testing one thing and potentially rewriting to something different.
Why are you redirecting 404s to the homepage? That is bad for users and SEO (and for development, since you don't know which requested URLs result in a 404, so you can't fix them). If you did want this, it is more easily achieved using an ErrorDocument directive, but you shouldn't be doing this to begin with. Create a custom 404 page that communicates the error to the user and allows them to navigate to the homepage (and other pages) if they wish (give them an incentive to do so).
Why remove the .html extension when this doesn't map to a physical file? You end up with the wrong URL logged in your access log (which should be a 404).
If all your filenames are lowercase then you could simply convert (i.e. "redirect") everything (provided it contains an uppercase letter) to the lowercased URL regardless of whether it is a "page file" or not. Note that this does not strictly make your URLs "case-insensitive" (which is arguably bad for SEO); it is simply an uppercase to lowercase redirect.
Try the following instead:
RewriteEngine On
ErrorDocument 404 /error-docs/my-custom-404.html
# www to non-www canonical redirect
RewriteCond %{HTTP_HOST} ^www\.(.+) [NC]
RewriteRule ^(.*)$ https://%1/$1 [R=301,L]
# Convert all URLs to lowercase
RewriteCond expr "tolower(%{REQUEST_URI}) =~ /(.*)/"
RewriteRule [A-Z] %1 [R=301,L]
# Remove ".html" extension only if this maps to a real file (in root only)
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^([\w-]+)\.html$ /$1 [R=301,L]
# Append ".html" if this matches a real file (in root only)
RewriteCond %{DOCUMENT_ROOT}/$1.html -f
RewriteRule ^([\w-]+)$ $1.html [L]
And create a /error-docs/my-custom-404.html file with your friendly "404 Not Found" custom error page with links to the homepage and elsewhere. (Taking this a step further, you can analyse the URL, check for typos etc. and suggest pages that the user perhaps intended to visit.)
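The use of the REDIRECT_STATUS environment variable is a cleaner method (IMO) to avoid a redirect loop and errors in the regex than using THE_REQUEST. It is empty on the initial request and only set once the request has been internally rewritten, so the ^$ condition restricts the rule to direct requests from the client. (Your existing regex was not correct since it would potentially match the query string as well, resulting in malformed redirects.)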
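To illustrate the query string problem, consider a hypothetical request for /foo?bar=page.html. The sketch below shows in comments why your original condition matches it, followed by one possible stricter pattern for anyone who prefers to stay with THE_REQUEST. The stricter regex is my own assumption, not something taken from your config, so test it before relying on it.

```apache
# THE_REQUEST holds the full request line, for example:
#   GET /foo?bar=page.html HTTP/1.1
#
# Original condition (shown as a comment only):
#   RewriteCond %{THE_REQUEST} \s/([^.]+)\.html [NC]
# Here [^.]+ also consumes "?" and "=", so %1 becomes "foo?bar=page"
# and the redirect target is /foo?bar=page.

# Stricter variant (an assumption): the capture excludes "?" and whitespace,
# and ".html" must be followed by "?" or the space before the protocol.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^.?\s]+)\.html[?\s]
RewriteRule ^ /%1 [R=301,L]
```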
Based off your original rules, this only works for files in the document root, not subdirectories. I assume this is intentional.
The regex character class [\w-] uses the \w shorthand character class and is the same as the more verbose [0-9a-zA-Z_-].
Make sure you clear your browser cache before testing and test first with 302 (temporary) redirects to avoid potential caching issues.
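As a sketch of what testing with temporary redirects might look like, here is the lowercase rule from above with R=302 swapped in. Change it back to R=301 only once you have confirmed that it behaves as intended.

```apache
# Testing variant: 302 (temporary) so the browser does not cache the redirect
RewriteCond expr "tolower(%{REQUEST_URI}) =~ /(.*)/"
RewriteRule [A-Z] %1 [R=302,L]
```

The same swap applies to the other R=301 rules while you are testing.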