Mod Rewrite URL Containing Encoded URL


I have a website with a few external links. They all use target="_blank" to open in a new tab/window. I want to log clicks on these links by routing them through an internal link that logs the click and then redirects to the requested external URL.

I use a simple (jQuery) function to rewrite the href of the links as follows:

    $("a[target=_blank]").each(function() {
        const href = $(this).href();
        $(this).href("out/" + encodeURIComponent(href));
    });

($.fn.href() is defined elsewhere).

This results in links with hrefs like:

    https://example.com/out/https%3A%2F%2Fwww.chartjs.org

So far so good.

Now I need my .htaccess to rewrite that to:

    index.php?token=out&action=https%3A%2F%2Fwww.chartjs.org

How do I achieve this?

I've tried everything I can think of, including:

    RewriteRule ^out/(.*)$ /index.php?token=out&action=$1 [L]

and

    RewriteRule ^out/([^/]+)$ index.php?token=out&action=$1 [L]

but I can't get either to work.

(I've also tried [L,B] and [L,QSA], to no avail.)

Any help would be much appreciated!


Solution

  • https://example.com/out/https%3A%2F%2Fwww.chartjs.org
    

    A URL like this, with encoded slashes (i.e. %2F) in the URL-path, will by default result in a 404 response before the directives in .htaccess get a chance to process the request. This is a "security" feature of Apache.

    However, you can override this behaviour and permit encoded slashes, but only in the server config (main server or <VirtualHost>), not in .htaccess. For example:

    # In the <VirtualHost> or main server config
    AllowEncodedSlashes NoDecode
    

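    Valid values are NoDecode (preferable, since it accepts such URLs but leaves %2F encoded) and On (which accepts them and decodes %2F like any other encoded character). Off is the default.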

    You can then use mod_rewrite in .htaccess to rewrite the request. For example:

    RewriteEngine On
    
    RewriteRule ^(out)/(.*) index.php?token=$1&action=$2 [B,L]
    

    Note that the URL-path matched by the RewriteRule directive is %-decoded, so you will need to use the B flag here to URL-encode the captured backreference.

    That said, you could simply rewrite to index.php (without explicitly passing the parameters) and parse the requested URL (i.e. $_SERVER['REQUEST_URI']) in PHP to extract the necessary information.


    Alternatively, consider removing (or manually encoding) the http(s):// prefix before building your URL, so that // (or %2F%2F) does not appear in the resulting URL to begin with. (Double slashes in the URL-path are also problematic when it comes to mod_rewrite.) You would still potentially have issues with slashes that occur later in the URL, although those don't necessarily need to be URL-encoded. You could even base64-encode the target URL (using btoa() in JavaScript) and later base64_decode() it in PHP.
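
The base64 idea could be sketched roughly as follows. Standard base64 output can itself contain `/` and `+`, so this sketch swaps in the URL-safe alphabet (`-` and `_`) and strips the `=` padding. That is an assumption added here, not something the approach strictly requires. The helper names `toOutPath` and `fromOutPath` are hypothetical, and the code assumes the href contains only ASCII characters (btoa() throws on anything outside Latin-1).

```javascript
// Hypothetical helper: turn an external URL into an "out/..." path
// that contains no slashes, percent signs, or plus signs.
function toOutPath(href) {
    const b64 = btoa(href);              // standard base64 (may contain + / =)
    const urlSafe = b64
        .replace(/\+/g, "-")             // "+" -> "-"
        .replace(/\//g, "_")             // "/" -> "_"
        .replace(/=+$/, "");             // drop trailing padding
    return "out/" + urlSafe;
}

// Hypothetical inverse, shown only to illustrate the round trip;
// in the setup described above, the decoding would happen in PHP.
function fromOutPath(path) {
    const urlSafe = path.replace(/^out\//, "");
    let b64 = urlSafe.replace(/-/g, "+").replace(/_/g, "/");
    b64 += "=".repeat((4 - (b64.length % 4)) % 4);   // restore padding
    return atob(b64);
}
```

With something like this, the question's loop would call `$(this).href(toOutPath(href))` instead of using encodeURIComponent(). On the PHP side, the `-` and `_` characters would need to be mapped back to `+` and `/` (for example with strtr()) before calling base64_decode().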