regex, .htaccess, http-redirect, mod-rewrite, regex-group

String repeats 20 times in RewriteRule if I try to use it with $1, $2 etc. back-references. Weird. Why?


I'm trying to set up a RewriteRule redirect in .htaccess where the output URL contains some additional text alongside the text captured by the capture groups.

Original URL: https://example.com/places/europe/hungary/budapest/
My regex pattern: ^places/([a-zA-Z-_]+/)?([a-zA-Z-_]+/)?([a-zA-Z-_]+/)?

I want to add a 'text-' string in the destination URL next to some of the $1, $2, ... back-references.

The full line in .htaccess: RewriteRule ^places/([a-zA-Z-_]+/)?([a-zA-Z-_]+/)?([a-zA-Z-_]+/)? https://example.com/places/$1text-$2$3 [R=301,L]

But it outputs exactly this: https://example.com/places/europe/text-text-text-text-text-text-text-text-text-text-text-text-text-text-text-text-text-text-text-text-hungary/budapest/

Instead of this: https://example.com/places/europe/text-hungary/budapest/

Yep, the additional 'text-' is repeated 20 times instead of once.

If I leave 'text-' out of the substitution string, everything works as expected, i.e. https://example.com/places/$1$2$3 [R=301,L] gives https://example.com/places/europe/hungary/budapest/

What could cause this anomaly, which seems strange to me? Should this work without glitches, or what is the correct syntax for this case?

All the other code in the .htaccess file (positioned after the RewriteRule in question):

# BEGIN rlrssslReallySimpleSSL rsssl_version[3.3.5]
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTPS} !=on [NC]
RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
</IfModule>
# END rlrssslReallySimpleSSL

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress

Yes, it's a WordPress install.

Thanks a lot for any help!


Solution

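  • As Amit suggested in the comments, you're running into this issue because the pattern [a-zA-Z_-]+/ matches the original segment hungary/ as well as the rewritten segment text-hungary/. The redirected URL therefore matches the rule again, and each pass through the resulting redirect loop prepends another text-.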

    You can use this rule, which adds a negative lookahead, to prevent the redirect loop that your current rule causes:

    RewriteRule ^places/([a-zA-Z_-]+/)((?!text-)[a-zA-Z_-]+/)([a-zA-Z_-]+/?)$ /places/$1text-$2$3 [R=301,L]
    

    (?!text-) is a negative lookahead condition that fails the match when the segment that would be captured as $2 already starts with text-.

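    Also note that an unescaped - should be placed at the start or end of a character class [...] so that it cannot be read as a range operator, which is why the rule above uses [a-zA-Z_-] rather than [a-zA-Z-_].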

    Make sure to clear your browser cache before testing this change, since browsers cache 301 redirects.
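
The loop described above can be reproduced outside Apache. The following is only a rough sketch: it uses Python's re module as a stand-in for the regex engine behind mod_rewrite, and the helper follow_redirects and its max_hops cap are hypothetical names introduced for illustration. The cap of 20 is an assumption chosen to mirror the count seen in the question, not a documented Apache setting.

```python
import re

# Patterns copied from the question (ORIGINAL) and from the answer (FIXED).
# Python's re is only a stand-in for the engine mod_rewrite uses.
ORIGINAL = re.compile(r"^places/([a-zA-Z-_]+/)?([a-zA-Z-_]+/)?([a-zA-Z-_]+/)?")
FIXED = re.compile(
    r"^places/([a-zA-Z_-]+/)((?!text-)[a-zA-Z_-]+/)([a-zA-Z_-]+/?)$"
)


def follow_redirects(pattern, path, max_hops=20):
    """Apply the rule repeatedly, the way a client following each 301 would.

    `path` is the per-directory path without a leading slash, which is what
    a RewriteRule in .htaccess is matched against. `max_hops` is a
    hypothetical stand-in for a client's redirect limit.
    Returns the final path and the number of redirects issued.
    """
    hops = 0
    while hops < max_hops:
        match = pattern.match(path)
        if match is None:
            break  # the rule no longer applies, so no further redirect
        # Unmatched optional groups expand to an empty string.
        g1, g2, g3 = (group or "" for group in match.groups())
        # Same substitution as in the rules: places/$1text-$2$3
        path = f"places/{g1}text-{g2}{g3}"
        hops += 1
    return path, hops


start = "places/europe/hungary/budapest/"
looped_path, looped_hops = follow_redirects(ORIGINAL, start)
fixed_path, fixed_hops = follow_redirects(FIXED, start)
print(looped_hops, looped_path)
print(fixed_hops, fixed_path)
```

With the original pattern every rewritten path matches again, so the helper only stops at the cap and text- piles up once per hop. With the lookahead version the second request fails to match and the path settles after a single redirect.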