
metallb round robin not working when accessed from external HAProxy


I have a sample app running in a Kubernetes cluster with 3 replicas. I am exposing the app with a Service of type=LoadBalancer using MetalLB.

The external IP issued is 10.10.10.11.

When I run curl 10.10.10.11 I get a different pod responding for each request as you would expect from round robin. This is the behaviour I want.

I have now set up HAProxy with a backend pointing to 10.10.10.11. However, each time I access the HAProxy frontend, I get the same pod responding to each request. If I keep refreshing, I intermittently get a different pod, sometimes after 20 refreshes, sometimes after 50+. I have tried clearing my browser history, but that has no effect.

I assume my HAProxy config is the cause of the problem, perhaps caching? But I have not configured any caching. I am an HAProxy newbie, so I might be missing something.

Here is my HAProxy config.

I have tried both mode tcp and mode http, but both give the same result (the same pod responding to each request)

global
    log /dev/log    local0
    log /dev/log    local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /home/simon/haproxy/haproxy_certs
    crt-base /home/simon/haproxy/haproxy_certs

    # See: https://ssl-config.mozilla.org/#server=haproxy&server-version=2.0.3&config=intermediate
    ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384
    ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
    ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets

defaults
    log global
    option  httplog
    option  dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000
    errorfile 400 /etc/haproxy/errors/400.http
    errorfile 403 /etc/haproxy/errors/403.http
    errorfile 408 /etc/haproxy/errors/408.http
    errorfile 500 /etc/haproxy/errors/500.http
    errorfile 502 /etc/haproxy/errors/502.http
    errorfile 503 /etc/haproxy/errors/503.http
    errorfile 504 /etc/haproxy/errors/504.http

frontend https
    bind *:443 ssl crt /home/simon/haproxy/haproxy_certs
    timeout client 60s
    mode tcp

    #Hello App
    acl ACL_hello_app hdr(host) -i hello.xxxxxxxxxdomain2.com
    use_backend hello_app if ACL_hello_app

    #Nginx App
    acl ACL_nginx_app hdr(host) -i nginx.xxxxxxxxxdomain1.com
    use_backend nginx_app if ACL_nginx_app

backend hello_app
    timeout connect 10s
    timeout server 100s
    mode tcp
    server hello_app    10.10.10.11:80

backend nginx_app
    mode tcp
    server nginx_app    10.10.10.10:80

UPDATE

Upon further testing, the issue seems to be related to timeout client, timeout connect and timeout server. If I reduce these to 1 second, I get a different pod every second. However, these timeouts are so short that I also get intermittent connection failures.

So I also have this question: is HAProxy able to work as a reverse proxy in front of another load balancer, or do I need to use another technology such as Nginx?


Solution

  • I eventually found the answer. I needed to use option http-server-close in my frontend settings.

    frontend https
        bind *:443 ssl crt /home/simon/haproxy/haproxy_certs
        http-response set-header Strict-Transport-Security "max-age=16000000; includeSubDomains; preload;"
        timeout client 5000s
        option http-server-close
        mode http
    
        #Hello App
        acl ACL_hello_app hdr(host) -i hello.soxprox.com
        use_backend hello_app if ACL_hello_app
    
        #Nginx App
        acl ACL_nginx_app hdr(host) -i nginx.soxprox.com
        use_backend nginx_app if ACL_nginx_app
    

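    With these settings I get correct round-robin results from MetalLB. Presumably this works because option http-server-close makes HAProxy close the server-side connection after each response, so every request opens a new connection to 10.10.10.11 and is balanced again, instead of reusing one kept-alive connection that stays pinned to a single pod.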
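
    The answer only shows the frontend. Since the backends in the question were declared with mode tcp, and HAProxy generally expects a frontend and the backends it routes to to use the same mode, they would probably need to be switched to mode http as well. The following is a rough sketch under that assumption, not the author's confirmed config; the server addresses and timeouts are copied from the question, and repeating option http-server-close in the backends is optional, since setting it on the frontend already applies it to the session.

    ```
    backend hello_app
        mode http
        # Close the server-side connection after each response so the next
        # request opens a new connection to the MetalLB IP
        option http-server-close
        timeout connect 10s
        timeout server 100s
        server hello_app    10.10.10.11:80

    backend nginx_app
        mode http
        option http-server-close
        server nginx_app    10.10.10.10:80
    ```

    Before reloading, the file can be checked with haproxy -c -f followed by the path to the config, which reports mode mismatches and other configuration errors.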