How to detect backend liveness with Varnish and Nginx

I have a Varnish server that sits in front of an Nginx server, which in turn sits in front of a web service (in truth they are all in k8s and have multiple replicas, but that seems irrelevant to the question).

Within Varnish, I need to detect the backend status so that I can control the length of grace period and serve stale data accordingly. The problem is that the backend for Varnish is Nginx, and as such the healthcheck is for Nginx itself and not for the underlying service.

The problem this causes, of course, is that if the actual service goes down, Nginx is still working, so Varnish thinks all is well with the world and refuses to extend the grace period accordingly.

Any ideas on how to get out of this conundrum? I've been trying to make Nginx expose the health checks of the underlying service, but to no avail.

Thanks

Solution

It is possible to define an extra backend for your origin web application and attach a health probe to it. This assumes Varnish can reach the web application directly, not only through Nginx.

Via std.healthy() you can figure out whether or not it is healthy, and adjust grace accordingly.

Here's a VCL example:

vcl 4.1;
import std;

probe health {
    .url = "/healthz";
}

backend proxy {
    .host = "proxy.example.com";
    .port = "80";
    .probe = health;
}
backend webapp {
    .host = "webapp.example.com";
    .port = "80";
    .probe = health;
}

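sub vcl_recv {
    # While the web application itself is healthy, accept objects
    # that are at most 10 seconds past their TTL.
    if (std.healthy(webapp)) {
        set req.grace = 10s;
    }
}

sub vcl_backend_response {
    # Keep every object for up to 24 hours past its TTL so that stale
    # content is available when the web application is down.
    set beresp.grace = 24h;
}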

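This example stores every object with a grace of 24 hours. While the webapp backend is healthy, req.grace limits the grace applied to each request to 10 seconds; when it is unhealthy, the limit is not set and the full 24 hours applies.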

The proxy backend is selected by default, because it is defined first. The webapp backend is only there for polling.
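
How quickly Varnish notices that the web application has gone down depends on the probe settings. As a rough sketch, the same setup could be written with an explicit probe configuration and an explicit backend selection. The probe name webapp_health and all interval, timeout, window and threshold values are assumptions chosen for illustration, not recommendations, and the proxy backend is left without a probe here only to keep the example short:

```vcl
vcl 4.1;
import std;

# Hypothetical probe values; tune them to your own service.
probe webapp_health {
    .url = "/healthz";   # health endpoint, as in the example above
    .interval = 5s;      # send a probe every 5 seconds
    .timeout = 2s;       # a probe that takes longer counts as failed
    .window = 5;         # look at the 5 most recent probes
    .threshold = 3;      # healthy if at least 3 of them succeeded
}

backend proxy {
    .host = "proxy.example.com";
    .port = "80";
}

backend webapp {
    .host = "webapp.example.com";
    .port = "80";
    .probe = webapp_health;
}

sub vcl_recv {
    # Send traffic through the proxy explicitly instead of relying
    # on the order in which the backends are defined.
    set req.backend_hint = proxy;

    # Only shorten grace while the web application itself is healthy.
    if (std.healthy(webapp)) {
        set req.grace = 10s;
    }
}

sub vcl_backend_response {
    set beresp.grace = 24h;
}
```

With these assumed values, the webapp backend would be marked sick once fewer than 3 of the last 5 probes succeed. A shorter interval detects outages sooner at the cost of more probe traffic to the service.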