.htaccess: Redirect and add parameter if URL contains a specific parameter


I have a URL to which external partners might append UTM parameters (utm_source, utm_medium, ...), and I have no control over this. If a URL contains the utm_source parameter, I need to redirect to the same URL with an additional source parameter that carries the value of utm_source.

So, if the URL contains the utm_source parameter:
https://example.com/path/index.htm?utm_source=abc&utm_medium=123

Redirect it to:
https://example.com/path/index.htm?utm_source=abc&utm_medium=123&source=abc

Note 1: the value of both utm_source and source is the same

Note 2: no redirection for:
https://example.com/path/index.htm

This is what I've tried, but it does not work:

RewriteCond %{QUERY_STRING} ^(utm_source=(.*).+)$  
RewriteRule ^(.*)$ $1%1&source=%2 [L]

How do I do this using .htaccess?

EDIT
The closest I've got, thanks to @Don'tPanic, is this:

RewriteCond %{QUERY_STRING} ^(.*)utm_source=([^&]*)(.*)$
RewriteRule (.*)intro.htm $1intro.htm?%1utm_source=%2&source=%2%3 [R,END]

This, however, sends the browser into an endless redirect loop.
How can I make it execute only once?


Solution

  • I fired up a Docker container and played around with this. The Apache docs give some tips, and some experimenting led me to a simplified version of their examples:

    RewriteEngine on
    RewriteCond %{QUERY_STRING} (.*)utm_source=([^&]*)(.*)
    RewriteRule index.php /index.php?%1utm_source=%2&source=%2%3 [END]
    

    The condition checks the query string for:

    • anything in the string before utm_source, captured as %1;
    • the value of utm_source, captured as %2;
    • anything in the string after the utm_source value, captured as %3.

    The rule then rewrites the request internally, preserving the components of the query string and inserting source= with whatever value utm_source had.
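
    To see what the three captures hold, here is a rough Python sketch of the same match and substitution. It assumes Python's re module treats this pattern the way Apache's PCRE does; rewrite_query is a name invented for the illustration and is not part of Apache.

```python
import re

# Same pattern as the RewriteCond; re.search mirrors its unanchored matching.
COND = re.compile(r"(.*)utm_source=([^&]*)(.*)")


def rewrite_query(query: str) -> str:
    """Return the query string the substitution would build.

    If the condition does not match, the rule is skipped and the
    query string is returned unchanged.
    """
    match = COND.search(query)
    if match is None:
        return query
    before, value, after = match.groups()  # %1, %2, %3
    return f"{before}utm_source={value}&source={value}{after}"


if __name__ == "__main__":
    samples = (
        "utm_source=stackoverflow",
        "a=foo&utm_source=so",
        "a=foo&utm_source=so&b=bar",
        "a=foo",
    )
    for query in samples:
        print(query, "->", rewrite_query(query))
```

    The greedy (.*) at the front takes everything up to utm_source, so any parameters before it end up in %1 and any after the value end up in %3.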

    This works for me. With an index.php that just dumps out $_GET, everything works as expected for these test URLs:

    http://localhost/index.php?utm_source=stackoverflow
    Array
    (
        [utm_source] => stackoverflow
        [source] => stackoverflow
    )
    
    http://localhost/index.php?a=foo&utm_source=so
    Array
    (
        [a] => foo
        [utm_source] => so
        [source] => so
    )
    
    http://localhost/index.php?a=foo&utm_source=so&b=bar
    Array
    (
        [a] => foo
        [utm_source] => so
        [source] => so
        [b] => bar
    )
    

    EDIT

    To handle all URLs, not just index.php (or .html), change the rewrite rule to match anything, capture the match, and rewrite to it:

    RewriteRule ^(.*)$ /$1?%1utm_source=%2&source=%2%3 [END]
    

    EDIT 2

    If you want to do an external redirect, which means you really make 2 requests to Apache and end up with a different URL in the browser, you open up a whole new can of worms:

    • Since it is a completely new request, the redirect rules are applied again, for the new request. END has no effect, because that terminates processing for the current request - it doesn't affect a new request.

    • The same goes for the more commonly used L flag: it doesn't affect a new request.

    • You could try checking that source does not exist in the URL yet, meaning this is the first request ... except source appears in utm_source, so that will never be true.

    • You could try checking for a word-break character before source to match it but not utm_source ... but the _ in utm_source would also match that.

    • There is a REDIRECT_STATUS environment flag which is set after the first internal redirect ... but that's internal, and you're now doing external.

    • You can set environment flags to track status in a request ... but they won't be around in a new request.
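
    To make the first and third points concrete, here is a rough Python simulation of an unguarded external redirect. It assumes Python's re module behaves like Apache's PCRE for these patterns; redirect_target, follow, NAIVE_GUARD, and the max_hops cutoff are all made up for the sketch (a real browser gives up on its own after some number of redirects).

```python
import re
from typing import List, Optional

# The RewriteCond pattern, with no guard against repeated application.
COND = re.compile(r"(.*)utm_source=([^&]*)(.*)")

# A bare "source=" test, as one might first try for a guard.
NAIVE_GUARD = re.compile(r"source=")


def redirect_target(query: str) -> Optional[str]:
    """Query string the rule would redirect to, or None if it does not apply."""
    match = COND.search(query)
    if match is None:
        return None
    before, value, after = match.groups()
    return f"{before}utm_source={value}&source={value}{after}"


def follow(query: str, max_hops: int = 5) -> List[str]:
    """Treat every redirect as a brand-new request, stopping after max_hops."""
    seen = [query]
    for _ in range(max_hops):
        target = redirect_target(seen[-1])
        if target is None:
            break
        seen.append(target)
    return seen


if __name__ == "__main__":
    # Every hop matches again and appends one more source= parameter.
    for hop in follow("utm_source=abc", max_hops=3):
        print(hop)
    # The naive guard already "finds" source= inside utm_source=.
    print(bool(NAIVE_GUARD.search("utm_source=abc")))
```

    In this simulation the query string grows by one source= parameter per hop and never settles, and the bare source= check fires on the very first request because it matches inside utm_source=.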

    All I can think of is to make sure the string utm_source=...&source=... does not already exist in the query string. This doesn't feel like a good solution to me, as it relies on the order of the parameters in the URL, but it works:

    RewriteEngine on
    RewriteCond %{QUERY_STRING} (.*)utm_source=([^&]*)(.*)
    RewriteCond %{QUERY_STRING} !utm_source=([^&]*)&source=\1
    RewriteRule ^(.*)$ /$1?%1utm_source=%2&source=%2%3 [R,L]
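
    The effect of the extra negated condition can be checked with a similar Python sketch, again assuming Python's re module matches the way Apache's PCRE does for these patterns. ALREADY_DONE, redirect_target, follow, and max_hops are hypothetical names used only for the illustration.

```python
import re
from typing import List, Optional

# First RewriteCond: split the query string around utm_source.
COND = re.compile(r"(.*)utm_source=([^&]*)(.*)")

# Second RewriteCond (negated in the rule): \1 requires source= to start
# with the same value that utm_source= carries, directly after it.
ALREADY_DONE = re.compile(r"utm_source=([^&]*)&source=\1")


def redirect_target(query: str) -> Optional[str]:
    """Query string to redirect to, or None if either condition stops the rule."""
    match = COND.search(query)
    if match is None or ALREADY_DONE.search(query):
        return None
    before, value, after = match.groups()
    return f"{before}utm_source={value}&source={value}{after}"


def follow(query: str, max_hops: int = 5) -> List[str]:
    """Treat every redirect as a brand-new request, stopping after max_hops."""
    seen = [query]
    for _ in range(max_hops):
        target = redirect_target(seen[-1])
        if target is None:
            break
        seen.append(target)
    return seen


if __name__ == "__main__":
    print(follow("a=foo&utm_source=so&b=bar"))      # one redirect, then stable
    print(follow("a=foo"))                          # no utm_source, no redirect
    print(follow("utm_source=so&b=bar&source=so"))  # source present but not adjacent
```

    The last sample shows the order dependence mentioned above: a source parameter that does not sit directly after utm_source is not recognised by the guard, so one more redirect is issued and a second source= is inserted.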
    

    Questions I used to get here: