
VCL return(lookup)


We are using Fastly and its Varnish to deliver content from our services. To distribute requests among several services, we use the following snippet:

    sub vcl_recv {
      #FASTLY recv
      if (req.url.path ~ "^/services/") {
        set req.url = regsub(req.url, "/services/(.*?)/", "/");
      }
    }

This works and lets us route /services/user/get to the /get endpoint of the user service.

However, using this snippet makes Fastly skip gzip compression completely. This can be fixed by adding return(lookup):

    sub vcl_recv {
      #FASTLY recv
      if (req.url.path ~ "^/services/") {
        set req.url = regsub(req.url, "/services/(.*?)/", "/");
      }
      return (lookup);
    }

At this point gzip compression works. Unfortunately, this makes all POST, PATCH, and DELETE requests arrive as GET.

I studied the Varnish docs, but I am not sure whether return(lookup) is truly what I need. Can you point me to how this should be implemented?


Solution

  • Built-in VCL

    Varnish uses the built-in VCL to create a set of common rules. They serve as a safety net to the end user.

    See https://github.com/varnishcache/varnish-cache/blob/master/bin/varnishd/builtin.vcl for the built-in VCL.

    You should not load this file yourself. Its rules are executed automatically whenever one of your subroutines finishes without an explicit return statement.

    Any explicit return statement will bypass the default behavior. Sometimes this is required to customize the behavior of Varnish. But sometimes it is counterproductive and causes undesired behavior.

    The consequences of your return(lookup) statement

    Your sub vcl_recv {} subroutine, which is responsible for handling incoming requests, ends with an unconditional return(lookup).

    This means every single request results in a cache lookup, even if the built-in VCL would rather pass these requests directly to the backend.

    There are 2 strategies to decide on caching:

    • Define rules for what can be cached and perform a return(pass) on all other requests
    • Define rules for what cannot be cached and perform a return(lookup) on all other requests

    Basically it's a blacklist/whitelist kind of thing.
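
    As a minimal sketch of the second strategy, written in the Varnish 2.1-style dialect the question uses (Varnish 4 and later spell this action return(hash)): the /account/ path is a hypothetical placeholder, not something taken from the question.

    ```vcl
    sub vcl_recv {
      #FASTLY recv

      # Strategy 2: name what must never be cached and pass it.
      # "/account/" is a hypothetical path used only for illustration.
      if (req.url.path ~ "^/account/") {
        return (pass);
      }

      # Everything else is looked up in the cache.
      return (lookup);
    }
    ```

    The first strategy inverts this: put return(lookup) inside the condition for paths known to be cacheable, and end the subroutine with return(pass).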

    GET & HEAD vs the other request methods

    The built-in VCL only allows GET and HEAD requests to be cached. Other request methods, such as POST, imply that state changes will occur. That's why they are not cached.

    If you try performing a return(lookup) for a POST call, Varnish will internally change this request to a GET.

    There are ways to cache POST calls, but in general you should not do that.

    How you should structure your sub vcl_recv {}

    I would advise you to remove the return(lookup) statement from your sub vcl_recv {} subroutine.

    As explained, the built-in VCL takes over as soon as your custom sub vcl_recv {} finishes without an explicit return.
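
    If removing the statement is not an option, for example because gzip only works with it as the question reports, one possible sketch keeps return(lookup) but passes every non-cacheable method first, which is roughly what the built-in logic would have done. It assumes Fastly's req.method variable and its FASTLYPURGE purge method. Adjust the condition if your service uses other methods.

    ```vcl
    sub vcl_recv {
      #FASTLY recv

      if (req.url.path ~ "^/services/") {
        set req.url = regsub(req.url, "/services/(.*?)/", "/");
      }

      # Send anything that is not GET or HEAD straight to the backend,
      # so POST, PATCH and DELETE keep their method and body.
      # FASTLYPURGE is assumed to be Fastly's purge method and is left alone.
      if (req.method != "HEAD" && req.method != "GET" && req.method != "FASTLYPURGE") {
        return (pass);
      }

      # Only cacheable methods reach the cache lookup.
      return (lookup);
    }
    ```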

    However, the built-in VCL will not be very helpful on its own, because your website probably has some cookies in place.

    It's important to strip off cookies in a sensible way, and keep them for requests that require cookies. For those pages, a return(pass) will be inserted to ensure that these personalized requests don't get looked up in cache, but are directly passed to the backend.
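
    A rough sketch of that idea, assuming static assets never depend on cookies and that any remaining cookie marks a personalized request. The list of file extensions is illustrative, not taken from the question.

    ```vcl
    sub vcl_recv {
      #FASTLY recv

      # Assumption: static assets do not vary per user, so their cookies
      # can be dropped and the responses shared between visitors.
      if (req.url.path ~ "\.(css|js|png|jpg|gif|svg|ico)$") {
        unset req.http.Cookie;
      }

      # Anything still carrying a cookie is treated as personalized
      # and goes directly to the backend.
      if (req.http.Cookie) {
        return (pass);
      }

      # No explicit return here: the default rules handle the rest.
    }
    ```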

    What about gzip?

    It is possible to figure out why gzip stopped working. The varnishlog tool allows you to introspect a running system and filter out logs.

    See https://feryn.eu/blog/varnishlog-measure-varnish-cache-performance/ for an extensive blog post I wrote about this topic.

    Maybe varnishlog can help you find the reason why gzip compression stopped working at some point.