I'm running HAProxy as a TCP load balancer in front of an on-prem Kubernetes cluster. I have set up a small app on each cluster node which returns HTTP 200 when the node is considered healthy. One of the health checks it performs is to query the Kube API and verify the node's status according to K8S itself. Now, if the Kube API goes down for some reason, all nodes will be considered unhealthy at the same time, even though the applications running on the workers are still available.
I'd like to set up HAProxy in such a way that whenever all worker nodes are down according to the health check, HAProxy just assumes they are all alive. If indeed all nodes are down, whether or not traffic is forwarded doesn't matter. If the reason they're all down is that some shared component doesn't respond, just blindly sending traffic will at least keep the service going.
I've searched the HAProxy reference for an option which does this, but I can't seem to find one. I think I should be able to get this functionality by registering each worker node twice, once regularly and once with the backup option specified. Then adding option allbackups to the backend would make it so that if all worker nodes are down, all worker nodes are used as backups. That would look like this:
backend workers
    mode tcp
    option httpchk HEAD /
    option allbackups
    server worker-001-1 <address-1> check port 32000
    server worker-001-2 <address-2> check port 32000
    server worker-001-1-backup <address-1> backup
    server worker-001-2-backup <address-2> backup
While this solution seems to work, it feels very hacky. Is there a cleaner way to do this? Is there an option I missed in the reference?
Thanks!
I found a more suitable solution in this answer: https://serverfault.com/a/923624/255500
It boils down to using backend switching rules and creating two backends for each group of clusters:
frontend ingress
    bind *:80 name http
    bind *:443 name https
    bind *:30000-32676 name nodeports
    mode tcp
    default_backend workers
    use_backend workers_backup if { nbsrv(workers) eq 0 }

backend workers
    mode tcp
    option httpchk HEAD /
    server worker-001-1 <address-1> check port 32000
    server worker-001-2 <address-2> check port 32000

backend workers_backup
    mode tcp
    server worker-001-1 <address-1> no-check
    server worker-001-2 <address-2> no-check
Once backend workers has zero servers up, backend workers_backup will be used. It still registers each node twice, but I think this is the better solution.
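If the inline condition gets hard to read, I believe the same rule can be written with a named ACL. This is only a sketch of that variant: the ACL name workers_all_down is something I made up for illustration, the single bind line is a placeholder, and the two backends are assumed to be defined exactly as above.

```haproxy
frontend ingress
    mode tcp
    bind *:80
    # nbsrv(<backend>) returns the number of usable servers in that backend
    acl workers_all_down nbsrv(workers) eq 0
    # Switch to the unchecked copies only when no checked server is up
    use_backend workers_backup if workers_all_down
    # Otherwise keep using the health-checked backend
    default_backend workers
```

The behaviour should be the same as the anonymous { nbsrv(workers) eq 0 } form. The named ACL mainly helps if several frontends or rules need to share the condition.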