For reasons that aren't worth going into here, Google has been indexing one of my sites with unnecessary query strings in the URL, which are wordfence_lh, hid and wordfence_logHuman. I'd like to modify my .htaccess file to remove all those query strings.
My URLs
example.com/page/111/?wordfence_lh=1&hid=CA2BA660BEFF26B9A17F8F85D7391BD4
example.com/page/80/?wordfence_logHuman=1&hid=647700EBF43600E7BC54103256F1D71B
Expected URLs
example.com/page/111/
example.com/page/80/
I've found a way to remove a single parameter, but I still can't find a regex or something to remove multiple query parameters. Any help is greatly appreciated, thanks so much!
Here's a part of my .htaccess file:
RewriteEngine On
RewriteBase /
RewriteCond %{HTTPS} on [OR]
RewriteCond %{SERVER_PORT} ^555$ [OR]
RewriteCond %{HTTP:X-Forwarded-Proto} https
RewriteRule .* - [E=WPR_SSL:-https]
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule .* - [E=WPR_ENC:_gzip]
RewriteCond %{REQUEST_METHOD} GET
RewriteCond %{QUERY_STRING} =""
RewriteCond %{HTTP:Cookie} !(wordpress_logged_in_.+|wp-postpass_|wptouch_switch_toggle|comment_author_|comment_author_email_) [NC]
RewriteCond %{REQUEST_URI} !^(/(.+/)?feed/?.+/?|/(?:.+/)?embed/|/(index\.php/)?wp\-json(/.*|$)|/cantonicalt/)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(facebookexternalhit).* [NC]
RewriteCond "%{DOCUMENT_ROOT}/wp-content/cache/wp-rocket/%{HTTP_HOST}%{REQUEST_URI}/index%{ENV:WPR_SSL}%{ENV:WPR_WEBP}.html%{ENV:WPR_ENC}" -f
RewriteRule .* "/wp-content/cache/wp-rocket/%{HTTP_HOST}%{REQUEST_URI}/index%{ENV:WPR_SSL}%{ENV:WPR_WEBP}.html%{ENV:WPR_ENC}" [L]
</IfModule>
I did not see any URL parameters other than these 3: wordfence_lh, hid and wordfence_logHuman. I want to remove them.
If you don't have any other URL parameters on any other URLs then it would be simplest to just remove the entire query string whenever one is present. For example:
# Remove any query string on all URLs
RewriteCond %{QUERY_STRING} .
RewriteRule ^ %{REQUEST_URI} [QSD,R=301,L]
This needs to go at the top of the .htaccess file, before your existing directives.
The RewriteCond directive checks for the presence of any query string. The QSD flag discards the query string from the redirect response.
However, if other URLs carry URL parameters that need to be preserved, then check for these specific URL parameters (as first suggested) and remove the entire query string if any of them is present. For example:
# Remove the entire query string if any one of the URL params is present
RewriteCond %{QUERY_STRING} (&|^)(wordfence_lh|hid|wordfence_logHuman)=
RewriteRule ^ %{REQUEST_URI} [QSD,R=301,L]
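To see which query strings the condition above would catch, here is a minimal sketch that runs the same pattern through Python's re module. It assumes that Python's regex engine treats this simple pattern the same way as the PCRE engine Apache uses; the function name should_strip and the utm_source sample are hypothetical and only there for illustration.

```python
import re

# Same pattern as in the RewriteCond above. Like the directive (no NC flag),
# the match is case-sensitive.
PARAM_PATTERN = re.compile(r"(&|^)(wordfence_lh|hid|wordfence_logHuman)=")

def should_strip(query_string: str) -> bool:
    """Return True if the condition would match, i.e. the redirect would fire."""
    return PARAM_PATTERN.search(query_string) is not None

samples = [
    "wordfence_lh=1&hid=CA2BA660BEFF26B9A17F8F85D7391BD4",
    "wordfence_logHuman=1&hid=647700EBF43600E7BC54103256F1D71B",
    "utm_source=newsletter",  # hypothetical analytics parameter, left alone
    "foohid=123&bar=1",       # "hid" is only the tail of a longer name
]
for qs in samples:
    print(f"{qs!r}: {should_strip(qs)}")
```

Because the parameter name must follow either the start of the query string or an &, a name that merely ends in hid does not trigger the redirect, and URLs that only carry other parameters pass through untouched.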
But I still don't want to interfere with other measurement tools like google analytics.
This isn't a problem unless you are using URL parameters on other URLs and these are sometimes mixed with the URL parameters you want to remove?
UPDATE:
Recently I have just tested with... Is it the same with your 2nd code? What is the difference?
RewriteCond %{QUERY_STRING} ^(.*)&?wordfence_lh=[^&]+&?(.*)$ [NC]
RewriteRule ^/?(.*)$ /$1?%1%2 [R=301,L]
RewriteCond %{QUERY_STRING} ^(.*)&?wordfence_logHuman=[^&]+&?(.*)$ [NC]
RewriteRule ^/?(.*)$ /$1?%1%2 [R=301,L]
RewriteCond %{QUERY_STRING} ^(.*)&?hid=[^&]+&?(.*)$ [NC]
RewriteRule ^/?(.*)$ /$1?%1%2 [R=301,L]
No, it's not "the same". It is "attempting" to preserve URL parameters that are mixed with the URL parameters you are wanting to remove (as mentioned in my last sentence above) - which does not appear to be a requirement for you.
However, there are a couple of issues with these directives:
It is matching too much and could potentially corrupt the query string. For example, it doesn't just match hid=, it would also match foohid= and will then preserve the foo part, which would potentially "break" the query string. e.g. Given a query string like foohid=123&bar=1, the above directive would redirect to foobar=1, which is obviously not correct.
This series of 3 rules potentially triggers 3 external redirects, since a separate redirect is triggered for each occurrence of a URL parameter you want to remove. This should (and can) be avoided. In your example URLs (which contain just two of these URL params), you would get two redirects. Two redirects isn't necessarily too bad, but it could be reduced to a single redirect (worst case).
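Both issues can be reproduced outside Apache. The following is a rough sketch in Python that models the three rule pairs as "first matching rule rewrites the query string, then the browser makes a new request". It assumes Python's re module handles these patterns the same way as Apache's PCRE engine; the helper names apply_first_matching_rule and follow_redirects are made up for this illustration.

```python
import re

# The three parameter names, in the order the rules appear in the question.
PARAMS = ["wordfence_lh", "wordfence_logHuman", "hid"]

def apply_first_matching_rule(query_string):
    """Return the rewritten query string, or None if no rule matches."""
    for name in PARAMS:
        # Mirrors: RewriteCond %{QUERY_STRING} ^(.*)&?NAME=[^&]+&?(.*)$ [NC]
        pattern = re.compile(rf"^(.*)&?{name}=[^&]+&?(.*)$", re.IGNORECASE)
        match = pattern.match(query_string)
        if match:
            # Mirrors the substitution ?%1%2
            return match.group(1) + match.group(2)
    return None

def follow_redirects(query_string):
    """Re-apply the rules until none match; return (final query, redirect count)."""
    redirects = 0
    while True:
        rewritten = apply_first_matching_rule(query_string)
        if rewritten is None:
            return query_string, redirects
        query_string = rewritten
        redirects += 1

# Issue 1: "foohid" is matched as if it were "hid" and the query is corrupted.
print(follow_redirects("foohid=123&bar=1"))      # ('foobar=1', 1)

# Issue 2: each parameter costs its own redirect.
print(follow_redirects("wordfence_lh=1&hid=CA2BA660BEFF26B9A17F8F85D7391BD4"))  # ('', 2)
```

The sketch only tracks the query string, so it says nothing about how Apache handles the URL path or the trailing ? left by an empty substitution.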