Tags: caching, url-rewriting, varnish, varnish-vcl

Ignore utm_* values with varnish?


Can I 'ignore' query string variables when looking up matching objects in the cache, without actually removing them from the URL the end user sees?

For example, the marketing utm_source, utm_campaign and other utm_* values don't change the content of the page. They just vary a lot from campaign to campaign and are used by all of our client-side tracking.

So this also means that the URL can't change on the client side, but it should somehow be 'normalized' in the cache.

Essentially I want all of these...

http://example.com/page/?utm_source=google

http://example.com/page/?utm_source=facebook&utm_content=123

http://example.com/page/?utm_campaign=usa

... to all HIT the cache entry for http://example.com/page/

However, this URL should not map to that same entry, because variation is not a utm_* param:

http://example.com/page/?utm_source=google&variation=5

Instead, it should use the cache entry for

http://example.com/page/?variation=5

Also, keep in mind that the URL the user sees must remain the same, so I can't redirect to a URL without the params or use any similar solution.


Solution

  • This did the trick, although it doesn't fully answer my own question: it ignores ALL query params, not just the utm_* ones. When I need a non-utm param that actually changes the content, I will have to revisit this regex:

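    sub vcl_recv {
        # Strip the whole query string so every variant maps to one cache object.
        # Only the request as Varnish sees it changes; no redirect is sent to the browser.
        set req.url = regsub(req.url, "\?.*", "");
    }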
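
A possible refinement that removes only the utm_* params and leaves everything else (such as variation=5) in place. This is an untested sketch, not a drop-in config: it assumes Varnish 4 or later syntax, lowercase param names, and params written as name=value. The three-step approach is my own choice for illustration, not something taken from the Varnish docs.

```vcl
sub vcl_recv {
    # Only touch URLs that actually carry a utm_* param.
    if (req.url ~ "[?&]utm_[a-z_]+=") {
        # 1. Remove every utm_* pair that follows an "&".
        set req.url = regsuball(req.url, "&utm_[a-z_]+=[^&]*", "");
        # 2. Remove a utm_* pair that directly follows the "?",
        #    together with the "&" after it, if there is one.
        set req.url = regsub(req.url, "\?utm_[a-z_]+=[^&]*&?", "?");
        # 3. Drop a "?" left dangling at the end of the URL.
        set req.url = regsub(req.url, "\?$", "");
    }
}
```

With this, /page/?utm_source=facebook&utm_content=123 should be looked up as /page/, and /page/?utm_source=google&variation=5 as /page/?variation=5. Note that the backend also receives the rewritten URL on a MISS, which is fine as long as the utm_* values are only needed by client-side tracking.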