Redirect using htaccess using a slash or underscore as separators


I've searched high and low for this scenario but keep running into solutions for simpler cases.

For example:

RewriteRule ^stays-the-same/[^A-Z]*[A-Z] %1 [R=301,L,NE]

The goal is to check whether the URL fits any of these possibilities:

/stays-the-same/variable/*

/stays-the-same/variable_*

/item-1/stays-the-same/variable/*

/item-1/stays-the-same/variable_*

Regardless of the case of the variable or of anything that comes after it, the request should 301 to the lowercase version of /item-1/stays-the-same/variable, and always to the https://www. version.

The asterisk denotes anything, for example multiple path segments, numbers, underscores, etc.

Any suggestions are very much appreciated.

Edit

"variable" is only letters or hyphens, upper or lower case.

"item-1" is static text and should always be in the end URL , even if it wasn't there as in the first two examples.

Everything after the variable is discarded, including any trailing / or _.

Edit 2 - the full .htaccess file

RewriteEngine On
RewriteBase /

RewriteCond expr "tolower(%{REQUEST_URI}) =~ m#^((/[^/]+)?/[^/]+/[a-z-]+)[/_]#"
RewriteCond %{HTTP_HOST}@%1 ^(?:www\.)?(.+?)\.?@(.+)
RewriteRule ^(item-1/)?stays-the-same/([a-z-]*[A-Z][a-z-]*)[/_] https://www.%1%2 [R=301,L]

RewriteCond %{HTTP_HOST} ^212\.212\.212\.212$
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]





RewriteCond expr "tolower(%{REQUEST_URI}) =~ /(.*)/"
RewriteRule ^stays-the-same/[^A-Z]*[A-Z] %1 [R=301,L,NE]




RewriteRule \.(jpe?g|png|gif|ico|bmp|pdf|docx?|txt|css|js)$ - [L,NC]

RewriteRule ^([^\s%20]+)(?:\s|%20)+([^\s%20]+)((?:\s|%20)+.*)$ $1-$2$3 [N,DPI]
RewriteRule ^([^\s%20]+)(?:\s|%20)+(.*)$ /$1-$2 [L,R=301,DPI]




RewriteRule ^signup/ https://www.example.com/signup [R=301,L]

RewriteRule ^word/ https://www.example.com/other-word [R=301,L]

RewriteCond %{REQUEST_URI} !pagespeed


RewriteCond %{QUERY_STRING} ^PageSpeed=noscript$ [NC]
RewriteRule .* %{REQUEST_URI}? [L,R=301]

Header set Content-Language "en"




RewriteCond %{QUERY_STRING} (^|&)option\=com_flag($|&)
RewriteCond %{QUERY_STRING} (^|&)view\=inventory($|&)
RewriteCond %{QUERY_STRING} (^|&)ajax\=true($|&)
RewriteCond %{QUERY_STRING} (^|&)country($|&)
RewriteRule ^index\.php$ /stays-the-same? [L,R=301]



## No directory listings
IndexIgnore *

## Can be commented out if causes errors, see notes above.
Options +FollowSymlinks
Options -Indexes

 
## Mod_rewrite in use.


RewriteEngine On

#RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} stays-the-same
RewriteRule ^(.*)/$ /$1 [L,R=301]

RewriteCond %{HTTP_HOST} ^example.com [NC]
RewriteRule ^(.*)$ https://www.example.com/$1 [L,R=301]

## Begin - Rewrite rules to block out some common exploits.
# If you experience problems on your site then comment out the operations listed
# below by adding a # to the beginning of the line.
# This attempts to block the most common type of exploit `attempts` on Joomla!
#
# Block any script trying to base64_encode data within the URL.
RewriteCond %{QUERY_STRING} base64_encode[^(]*\([^)]*\) [OR]
# Block any script that includes a <script> tag in URL.
RewriteCond %{QUERY_STRING} (<|%3C)([^s]*s)+cript.*(>|%3E) [NC,OR]
# Block any script trying to set a PHP GLOBALS variable via URL.
RewriteCond %{QUERY_STRING} GLOBALS(=|\[|\%[0-9A-Z]{0,2}) [OR]
# Block any script trying to modify a _REQUEST variable via URL.
RewriteCond %{QUERY_STRING} _REQUEST(=|\[|\%[0-9A-Z]{0,2})
# Return 403 Forbidden header and show the content of the root homepage
RewriteRule .* index.php [F]
#
## End - Rewrite rules to block out some common exploits.



## Begin - Custom redirects

# 301 --- https://www.example.com/?ref=producthunt => https://www.example.com
RewriteCond %{QUERY_STRING} (^|&)ref\=producthunt($|&)
RewriteRule ^$ /? [L,R=301]

# 301 --- https://www.example.com/signup?view=item-2 => https://www.example.com/signup
RewriteCond %{QUERY_STRING} (^|&)view\=item-2($|&)
RewriteRule ^signup$ /signup? [L,R=301]

RewriteRule ^stay-same/(.*)$ /shoes/$1 [QSA,R=301,L]





#
# If you need to redirect some pages, or set a canonical non-www to
# www redirect (or vice versa), place that code here. Ensure those
# redirects use the correct RewriteRule syntax and the [R=301,L] flags.
#
## End - Custom redirects

##
# Uncomment the following line if your webserver's URL
# is not directly related to physical file paths.
# Update Your Joomla! Directory (just / for root).
##

# RewriteBase /

## Begin - Joomla! core SEF Section.
#
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
#
# If the requested path and file is not /index.php and the request
# has not already been internally rewritten to the index.php script
RewriteCond %{REQUEST_URI} !^/index\.php
# and the requested path and file doesn't directly match a physical file
RewriteCond %{REQUEST_FILENAME} !-f
# and the requested path and file doesn't directly match a physical folder
RewriteCond %{REQUEST_FILENAME} !-d
# internally rewrite the request to the index.php script
RewriteRule .* index.php [L]
#
## End - Joomla! core SEF Section.



php_value upload_max_filesize 5M
php_value post_max_size 10M
php_value memory_limit 320M

Solution

  • UPDATE: The rule and conditions have been changed to allow an all-lowercase variable and to enforce the /item-1 prefix on the resulting URL-path. The part of the URL-path after the variable is discarded. Because the pattern requires a / or _ after the variable, something is always discarded.

    Try the following (requires Apache 2.4):

    RewriteCond expr "tolower(%{REQUEST_URI}) =~ m#^(?:/item-1)?/([^/]+/[a-z-]+)[/_]#"
    RewriteCond %{HTTP_HOST}@%1 ^(?:www\.)?(.+?)\.?@(.+)
    RewriteRule ^(item-1/)?stays-the-same/([a-z-]+)[/_] https://www.%1/item-1/%2 [NC,R=301,L]
    

    The RewriteRule pattern establishes whether the requested URL-path matches one of the possible URLs and whether the variable contains only letters or hyphens. The NC flag makes this a case-insensitive match, so the variable can be uppercase, lowercase, or mixed.

    The first condition then converts the URL-path to lowercase and captures the part up to and including the variable (excluding the optional /item-1 prefix). This is passed to the following condition in the %1 backreference.

    The second condition then extracts the hostname, less the www. prefix (if any), and stores it in the %1 backreference (overwriting the previous value). It also passes the lowercased URL-path (from the first condition) on to the following rule in the %2 backreference.

    Note that the %2 backreference in the substitution string (that represents the URL-path) now excludes the slash prefix, so the slash is now included in the substitution string - I think this is more readable than including the slash as part of the backreference (as it was previously).

    The /item-1 prefix needs to be hardcoded in the substitution string since this is optional in the requested URL-path.
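
    To make the flow of backreferences easier to follow, here is the same rule annotated with a worked trace. The hostname example.com and the request path /Stays-The-Same/Foo-Bar_123/more are assumptions chosen purely for illustration, and the comments show what each step should capture for that request:

    ```apache
    # Hypothetical request (assumed for illustration):
    #   http://example.com/Stays-The-Same/Foo-Bar_123/more
    #
    # In .htaccess the RewriteRule pattern is tested first, against the
    # URL-path without its leading slash: Stays-The-Same/Foo-Bar_123/more
    # With NC it matches; the variable is "Foo-Bar", followed by "_".

    # Lowercased URL-path: /stays-the-same/foo-bar_123/more
    #   %1 = stays-the-same/foo-bar
    RewriteCond expr "tolower(%{REQUEST_URI}) =~ m#^(?:/item-1)?/([^/]+/[a-z-]+)[/_]#"

    # TestString: example.com@stays-the-same/foo-bar
    #   %1 = example.com
    #   %2 = stays-the-same/foo-bar
    RewriteCond %{HTTP_HOST}@%1 ^(?:www\.)?(.+?)\.?@(.+)

    # Redirects to: https://www.example.com/item-1/stays-the-same/foo-bar
    RewriteRule ^(item-1/)?stays-the-same/([a-z-]+)[/_] https://www.%1/item-1/%2 [NC,R=301,L]
    ```

    A request that already includes the prefix, such as /item-1/Stays-The-Same/Foo-Bar/, should end up at the same target, since the optional /item-1 is skipped by the non-capturing group and re-added by the substitution string.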


    You could simplify this by hardcoding the www hostname, which avoids the second condition. For example:

    RewriteCond expr "tolower(%{REQUEST_URI}) =~ m#^(?:/item-1)?/([^/]+/[a-z-]+)[/_]#"
    RewriteRule ^(item-1/)?stays-the-same/([a-z-]+)[/_] https://www.example.com/item-1/%1 [NC,R=301,L]
    

    This uses the %1 backreference from the first condition directly in the substitution string.


    Test first with a 302 (temporary) redirect to avoid potential caching issues and make sure the browser cache is cleared before testing.
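
    As a sketch of that testing step, the simplified rule with a temporary redirect would look like this. It assumes the hardcoded www.example.com hostname from the example above; the only change is the R=302 flag, which can be switched back to R=301 once the redirect behaves as expected:

    ```apache
    # Temporary (302) redirect while testing, so browsers do not cache the result
    RewriteCond expr "tolower(%{REQUEST_URI}) =~ m#^(?:/item-1)?/([^/]+/[a-z-]+)[/_]#"
    RewriteRule ^(item-1/)?stays-the-same/([a-z-]+)[/_] https://www.example.com/item-1/%1 [NC,R=302,L]
    ```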