Tags: kubernetes, nginx-ingress, coredns

Kubernetes Nginx Ingress controller Readiness Probe failed


I am trying to set up my very first Kubernetes cluster, and everything seemed fine until the nginx-ingress controller. Here is my cluster information:

  • Nodes: three RHEL7 nodes and one RHEL8 node
  • Master is running on RHEL7
  • Kubernetes server version: 1.19.1
  • Networking used: flannel
  • coredns is running fine
  • SELinux and firewalld are disabled on all nodes

Here is a screenshot of all my pods running in kube-system.

I then followed the instructions on the following page to install the NGINX Ingress Controller: https://docs.nginx.com/nginx-ingress-controller/installation/installation-with-manifests/

Instead of a Deployment, I decided to use a DaemonSet, since I am going to have only a few nodes in my Kubernetes cluster.
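
From memory of that guide, the DaemonSet variant is applied from the repository's deployments folder with something like the following (the repository path and file name are my recollection and may have changed, and the guide's prerequisites such as the namespace, RBAC, secret, and ConfigMap still need to be applied first):

git clone https://github.com/nginxinc/kubernetes-ingress.git
cd kubernetes-ingress/deployments
# apply the guide's prerequisites from common/ and rbac/ first, then:
kubectl apply -f daemon-set/nginx-ingress.yaml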

After following the instructions, the pod on my RHEL8 node is constantly failing with the following error:

Readiness probe failed: Get "http://10.244.3.2:8081/nginx-ready": dial tcp 10.244.3.2:8081: connect: connection refused Back-off restarting failed container
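
For context, this probe corresponds to a readinessProbe section in the DaemonSet manifest roughly like the one below (reconstructed from the error message rather than copied from the manifest, so the exact fields may differ):

readinessProbe:
  httpGet:
    path: /nginx-ready
    port: 8081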

Here is a screenshot showing that the RHEL7 pods are working just fine while the RHEL8 pod keeps failing.

All nodes are set up exactly the same way and there is no difference between them. I am very new to Kubernetes and don't know much about its internals. Can someone please point me to how I can debug and fix this issue? I am really willing to learn from issues like this.
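
For reference, these are the generic inspection commands I understand apply here (the pod name is a placeholder, and nginx-ingress is the namespace the manifests guide creates):

kubectl get pods -n nginx-ingress -o wide
kubectl describe pod <nginx-ingress-pod-name> -n nginx-ingress
kubectl logs <nginx-ingress-pod-name> -n nginx-ingress
# from one of the nodes, hit the readiness endpoint directly
curl -v http://10.244.3.2:8081/nginx-ready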

This is how I provisioned the RHEL7 and RHEL8 nodes:

  1. Installed docker version: 19.03.12, build 48a66213fe
  2. Disabled firewalld
  3. Disabled swap
  4. Disabled SELinux
  5. To enable iptables to see bridged traffic, set net.bridge.bridge-nf-call-ip6tables = 1 and net.bridge.bridge-nf-call-iptables = 1 (a sketch of this is shown after these steps)
  6. Added hosts entry for all the nodes involved in Kubernetes cluster so that they can find each other without hitting DNS
  7. Added IP address of all nodes in Kubernetes cluster on /etc/environment for no_proxy so that it doesn't hit corporate proxy
  8. Verified docker driver to be "systemd" and NOT "cgroupfs"
  9. Reboot server
  10. Install kubectl, kubeadm, kubelet as per kubernetes guide here at: https://kubernetes.io/docs/tasks/tools/install-kubectl/
  11. Start and enable kubelet service
  12. Initialize master by executing the following:
kubeadm init --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12
  13. Apply node-selector patch for mixed OS scheduling
wget https://raw.githubusercontent.com/Microsoft/SDN/master/Kubernetes/flannel/l2bridge/manifests/node-selector-patch.yml
kubectl patch ds/kube-proxy --patch "$(cat node-selector-patch.yml)" -n=kube-system
  14. Apply flannel CNI
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Modify the net-conf.json section of kube-flannel.yml to use the backend type "host-gw"
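
After that change, the net-conf.json block looks roughly like this (the Network value matches the --pod-network-cidr used above; the upstream default Type is "vxlan"):

  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "host-gw"
      }
    }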

kubectl apply -f kube-flannel.yml

Apply the node-selector patch to the flannel daemon-set

kubectl patch ds/kube-flannel-ds-amd64 --patch "$(cat node-selector-patch.yml)" -n=kube-system
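
For completeness, the bridged-traffic settings from step 5 are usually made persistent with a sysctl drop-in along these lines (the file name k8s.conf is just a common convention, not something prescribed):

modprobe br_netfilter
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system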

Thanks


Solution

  • According to the Kubernetes documentation, the list of supported host operating systems is as follows:

    • Ubuntu 16.04+
    • Debian 9+
    • CentOS 7
    • Red Hat Enterprise Linux (RHEL) 7
    • Fedora 25+
    • HypriotOS v1.0.1+
    • Flatcar Container Linux (tested with 2512.3.0)

    This article mentioned that there are network issues on RHEL 8:

    (2020/02/11 Update: After installation, I keep facing a pod network issue, e.g. a deployed pod is unable to reach the external network, or pods deployed on different workers are unable to ping each other, even though I can see all nodes (master, worker1 and worker2) are ready via kubectl get nodes. After checking through the kubernetes.io official website, I observed that the nftables backend is not compatible with the current kubeadm packages. Please refer to the section “Ensure iptables tooling does not use the nftables backend”.)
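
    For reference, that guidance boils down to checking which backend the iptables binary is using and, on distributions that package both backends, switching to the legacy one (a sketch of those commands; RHEL 8 may not ship iptables-legacy at all, which is part of why reinstalling is the simpler route):

    # the version string contains "(nf_tables)" when the nftables backend is in use
    iptables --version

    # switch to the legacy backend where it is packaged
    update-alternatives --set iptables /usr/sbin/iptables-legacy
    update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
    update-alternatives --set arptables /usr/sbin/arptables-legacy
    update-alternatives --set ebtables /usr/sbin/ebtables-legacy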

    The simplest solution here is to reinstall the node with a supported operating system.