Log image filename that's cached by external cdn using htaccess


I want to keep a log of image filenames whenever a specific CDN caches our images, but I can't quite get it to work. Right now, my code looks something like:

RewriteCond %{HTTP_USER_AGENT} Photon/1.0
RewriteRule ^(.*)$ log.php?image=$1 [L]

The above always logs the image as being "log.php" even if I'm making the cdn cache "example.jpg" and I thoroughly don't understand why.


Solution

  • The above always logs the image as being "log.php" even if I'm making the cdn cache "example.jpg" and I thoroughly don't understand why.

    Because, in .htaccess, the rewrite engine loops until the URL passes through unchanged, despite the presence of the L flag. Your rule matches everything, including log.php, so log.php is the "image" that is ultimately logged. The L flag only stops the current pass through the rewrite engine.

    For example:

    1. Request /example.jpg
    2. Request is rewritten to log.php?image=example.jpg
    3. Rewrite engine starts over, passing /log.php?image=example.jpg to the start of the second pass.
    4. Request is rewritten to log.php?image=log.php by the same RewriteRule directive.
    5. Rewrite engine starts over, passing /log.php?image=log.php to the start of the third pass.
    6. Request is rewritten to log.php?image=log.php (again).
    7. URL has not changed in the last pass - processing stops.

    You need to make an exception so that log.php itself is not processed. Alternatively, process only non-.php files (instead of everything) or, if only images are meant to be logged, match only image files.

    For example:

    # Log images only
    RewriteCond %{HTTP_USER_AGENT} Photon/1\.0
    RewriteRule ^(.+\.(?:png|jpg|webp|gif))$ log.php?image=$1 [L]
    

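    Remember to backslash-escape literal dots in the regex. An unescaped dot matches any single character, so Photon/1.0 would also match a string such as Photon/1x0.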

    Or,

    # Log Everything except log.php itself
    RewriteCond %{HTTP_USER_AGENT} Photon/1\.0
    RewriteCond %{REQUEST_URI} ^/(.+)
    RewriteRule !^log\.php$ log.php?image=%1 [L]
    

    In the last example, %1 refers to the captured subpattern in the preceding CondPattern. I captured the path this way, rather than using REQUEST_URI directly, because your original directive excludes the slash prefix (i.e. it passes image.jpg to your script when /image.jpg is requested). If you want to log the slash prefix as well, you can omit the second condition and pass REQUEST_URI directly. For example:

    # Log Everything except log.php itself (include slash prefix)
    RewriteCond %{HTTP_USER_AGENT} Photon/1\.0
    RewriteRule !^log\.php$ log.php?image=%{REQUEST_URI} [L]
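
    The remaining option mentioned earlier, processing only non-.php requests, could look something like the following. This is an untested sketch that assumes no .php URLs need to be logged; it keeps the original behaviour of passing the path without the slash prefix:

    # Log everything except requests for .php files (sketch, adjust as needed)
    RewriteCond %{HTTP_USER_AGENT} Photon/1\.0
    RewriteCond %{REQUEST_URI} !\.php$
    RewriteRule (.+) log.php?image=$1 [L]

    On the second pass the URL is /log.php, so the negated condition fails and the rule is skipped.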
    

    Alternatively, on Apache 2.4+ you can use the END flag instead of L to force the rewrite engine to stop and prevent further passes through the rewrite engine. For example:

    RewriteCond %{HTTP_USER_AGENT} Photon/1\.0
    RewriteRule (.+) log.php?image=$1 [END]
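
    If END is not available (Apache 2.2), a commonly used workaround is to check the REDIRECT_STATUS environment variable, which is empty on the initial request and set to 200 after the first successful rewrite. This is a different technique from the END flag, shown here only as a rough, untested sketch:

    # Apache 2.2-compatible sketch: only act on the initial pass
    RewriteCond %{ENV:REDIRECT_STATUS} ^$
    RewriteCond %{HTTP_USER_AGENT} Photon/1\.0
    RewriteRule (.+) log.php?image=$1 [L]

    As with the END version, a direct request for log.php from the CDN would still be rewritten once and logged as "log.php".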