Search code examples
kuberneteshaproxykubernetes-podcontainerd

HAproxy static pod runs 100% CPU on node on containerd


I am working on Deprecating Docker shim and using containerd as container engine in kubelet. It caused haproxy Static pod to run 100% CPU on node. The resource limits doesn't work for static pod so that did not work. The same pod when run as a container in Podman, works fine and it worked fine when docker was the container engine for K8s. Any way to troubleshoot what's causing the issue as kubelet, containerd and haproxy are involved?

HA-Proxy version 2.0.25-6986403 2021/09/07 - https://haproxy.org/
Build options :
  TARGET  = linux-glibc
  CPU     = generic
  CC      = gcc
  CFLAGS  = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-old-style-declaration -Wno-ignored-qualifiers -Wno-clobbered -Wno-missing-field-initializers -Wno-implicit-fallthrough -Wno-stringop-overflow -Wno-cast-function-type -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference
  OPTIONS = USE_PCRE2=1 USE_PCRE2_JIT=1 USE_GETADDRINFO=1 USE_OPENSSL=1 USE_LUA=1 USE_ZLIB=1

Feature list : +EPOLL -KQUEUE -MY_EPOLL -MY_SPLICE +NETFILTER -PCRE -PCRE_JIT +PCRE2 +PCRE2_JIT +POLL -PRIVATE_CACHE +THREAD -PTHREAD_PSHARED -REGPARM -STATIC_PCRE -STATIC_PCRE2 +TPROXY +LINUX_TPROXY +LINUX_SPLICE +LIBCRYPT +CRYPT_H -VSYSCALL +GETADDRINFO +OPENSSL +LUA +FUTEX +ACCEPT4 -CLOSEFROM -MY_ACCEPT4 +ZLIB -SLZ +CPU_AFFINITY +TFO +NS +DL +RT -DEVICEATLAS -51DEGREES -WURFL -SYSTEMD -OBSOLETE_LINKER +PRCTL +THREAD_DUMP -EVPORTS

Default settings :
  bufsize = 16384, maxrewrite = 1024, maxpollevents = 200

Built with multi-threading support (MAX_THREADS=64, default=8).
Built with OpenSSL version : OpenSSL 1.1.1d  10 Sep 2019
Running on OpenSSL version : OpenSSL 1.1.1d  10 Sep 2019
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
Built with Lua version : Lua 5.3.3
Built with network namespace support.
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with PCRE2 version : 10.32 2018-09-10
PCRE2 library supports JIT : yes
Encrypted password support via crypt(3): yes
Built with the Prometheus exporter as a service

Available polling systems :
      epoll : pref=300,  test result OK
       poll : pref=200,  test result OK
     select : pref=150,  test result OK
Total: 3 (3 usable), will use epoll.

Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
              h2 : mode=HTX        side=FE|BE     mux=H2
              h2 : mode=HTTP       side=FE        mux=H2
       <default> : mode=HTX        side=FE|BE     mux=H1
       <default> : mode=TCP|HTTP   side=FE|BE     mux=PASS

Available services :
    prometheus-exporter

Available filters :
    [SPOE] spoe
    [COMP] compression
    [CACHE] cache
    [TRACE] trace

Here is the static pod spec

apiVersion: v1
kind: Pod
metadata:
  name: static-web
  labels:
    role: myrole
spec:
  containers:
    - name: web
      image: haproxy:v2.0.0
      - mountPath: /usr/local/etc/haproxy/haproxy.cfg
        name: configfile
      ports:
        - name: web
          containerPort: 6443
          protocol: TCP
  volumes:
  - name: configfile
    hostPath:
      path: /etc/kubernetes/config/haproxy.cfg
      type: FileOrCreate

Solution

  • The issue is fixed. When maxconnections is not declared in global params, haproxy automatically uses max open file kernel parameter and in my case it was 536870882. This is causing 100% CPU and 95% memory consumption. Now added maxconn 10000 in global and the issue is fixed.

    global
      maxconn 10000