
Kubernetes cluster REST API error: 500 internal server error


I have a k8s cluster deployed using kubespray. The load balancer is MetalLB. I have deployed a Helm chart in this cluster that exposes a REST service at 10.0.8.26:50028.

I am sending requests to this service:

http://10.0.8.26:50028/data/v3/authentication

http://10.0.8.26:50028/data/v3/actions

http://10.0.8.26:50028/data/v3/versions

But each time I call an endpoint, it returns responses in this order:

503 transport is closing

500 Internal server

500 Internal server

204 - correct response

The same sequence is returned for each endpoint. Once an endpoint returns a correct response, it produces no further errors, but calling a new endpoint triggers the errors again.

Can someone please help me?


Solution

  • This error was related to the connections between the services in the cluster. The cluster was running kube-proxy in IPVS mode. Because of the IPVS timeouts (on the nodes), idle connections between the services were terminated after 900 seconds:

    $ ipvsadm -l --timeout    
    Timeout (tcp tcpfin udp): 900 120 300 
    

    That means the TCP connections were terminated by an agent other than the application (IPVS on the node). My application uses the gRPC protocol for communication between some of its services. After setting gRPC keepalive in the application's code and lowering the TCP keepalive time of the pods, the issue was resolved.

    The following links may provide more details:

    https://success.docker.com/article/ipvs-connection-timeout-issue

    https://github.com/moby/moby/issues/31208

    https://github.com/kubernetes/kubernetes/issues/80298
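
The keepalive fix described above could look roughly like the following sketch. It is not the original application's code: the function names, the 0.5 safety fraction, the 20-second ping timeout, and the socket-level values are all assumptions chosen for illustration. The answer lowered the TCP keepalive of the pods themselves; the second function shows the per-socket equivalent instead, which is a different mechanism with the same goal of sending traffic before the 900-second IPVS timeout expires.

```python
import socket

# TCP timeout reported by `ipvsadm -l --timeout` in the answer above.
IPVS_TCP_TIMEOUT_S = 900


def grpc_keepalive_options(idle_timeout_s=IPVS_TCP_TIMEOUT_S, fraction=0.5):
    """Build gRPC channel options that ping well before the IPVS timeout.

    Hypothetical helper: `fraction` and the 20 s ping timeout are assumed values.
    """
    keepalive_s = max(1, int(idle_timeout_s * fraction))
    return [
        # Send an HTTP/2 keepalive ping after this much time without activity.
        ("grpc.keepalive_time_ms", keepalive_s * 1000),
        # Consider the connection dead if the ping is not acknowledged in time.
        ("grpc.keepalive_timeout_ms", 20_000),
        # Keep pinging even when no RPC is in flight (idle connections).
        ("grpc.keepalive_permit_without_calls", 1),
    ]


def enable_tcp_keepalive(sock, idle_s=120, interval_s=30, probes=3):
    """Turn on TCP keepalive for one socket (assumed values, Linux options)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE and friends are not available on every platform.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock


# Usage with the grpcio package (not imported here so the sketch stays stdlib-only):
#   channel = grpc.insecure_channel("my-service:50051",  # hypothetical target
#                                   options=grpc_keepalive_options())
```

A gRPC server may reject keepalive pings that arrive more often than it permits, so the client interval should be checked against the server's own keepalive settings.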