I need help understanding in detail how an ingress controller, specifically the ingress-nginx ingress controller, is supposed to work. To me, it appears as a black box that is supposed to listen on a public IP, terminate TLS, and forward traffic to a pod. But exactly how that happens is a mystery to me.
The primary goal here is understanding, the secondary goal is troubleshooting an immediate issue I'm facing.
I have a cluster with five nodes and am trying to get the JupyterHub application to run on it. For the most part, it is working fine. I'm using a pretty standard Rancher RKE setup with flannel/calico for the networking. The nodes run Red Hat 7.9 with iptables and firewalld, and Docker 19.03.
The JupyterHub proxy is set up with a ClusterIP service (I also tried a NodePort service, which also works). I also set up an ingress. The ingress sometimes works, but often does not respond (the connection times out). Specifically, if I delete the ingress and then redeploy my Helm chart, the ingress starts working. Also, if I restart one of my nodes, the ingress starts working again. I have not identified the circumstances under which the ingress stops working.
Here are my relevant services:
kubectl get services
NAME           TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE
hub            ClusterIP   10.32.0.183   <none>        8081/TCP   378d
proxy-api      ClusterIP   10.32.0.11    <none>        8001/TCP   378d
proxy-public   ClusterIP   10.32.0.30    <none>        80/TCP     378d
This works; telnet 10.32.0.30 80 responds as expected (of course only from one of the nodes). I can also telnet directly to the proxy-public pod (10.244.4.41:8000 in my case).
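As a side note, the way the proxy-public ClusterIP maps to that pod IP can be seen in the service's endpoints. A quick check (assuming the jhub namespace used by the ingress below; the output is reconstructed from the pod IP above for illustration):

kubectl -n jhub get endpoints proxy-public
NAME           ENDPOINTS          AGE
proxy-public   10.244.4.41:8000   378d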
Here is my ingress.
kubectl describe ingress
Name:             jupyterhub
Labels:           app=jupyterhub
                  app.kubernetes.io/managed-by=Helm
                  chart=jupyterhub-1.2.0
                  component=ingress
                  heritage=Helm
                  release=jhub
Namespace:        jhub
Address:          k8s-node4.<redacted>,k8s-node5.<redacted>
Default backend:  default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
TLS:
  tls-jhub terminates jupyterhub.<redacted>
Rules:
  Host                   Path  Backends
  ----                   ----  --------
  jupyterhub.<redacted>
                         /     proxy-public:http (10.244.4.41:8000)
Annotations:             field.cattle.io/publicEndpoints:
                           [{"addresses":["",""],"port":443,"protocol":"HTTPS","serviceName":"jhub:proxy-public","ingressName":"jhub:jupyterhub","hostname":"jupyterh...
                         meta.helm.sh/release-name: jhub
                         meta.helm.sh/release-namespace: jhub
Events:                  <none>
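For what it's worth, the Address field lists k8s-node4 and k8s-node5, which appears to be where the nginx ingress controller pods are running. Assuming a stock RKE setup, where nginx-ingress-controller is deployed as a DaemonSet in the ingress-nginx namespace (adjust the namespace if yours differs), the controller pods and the nodes they run on can be listed with:

kubectl -n ingress-nginx get pods -o wide
kubectl -n ingress-nginx get ds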
What I understand so far about the ingress in this situation:
Traffic arrives on port 443 at k8s-node4 or k8s-node5. Some magic (controlled by the ingress controller) receives that traffic, terminates TLS, and sends the unencrypted traffic to the pod's IP at port 8000. That's the part I want to understand better.
That black box seems to at least partially involve flannel/calico and some iptables magic, and it obviously also involves nginx at some point.
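One way to make the nginx part less of a black box is to look at the configuration the controller generates from the Ingress objects. This is a sketch, again assuming the ingress-nginx namespace; <controller-pod> is a placeholder for one of the controller pod names, and newer ingress-nginx versions fill in the upstream endpoints via Lua, so the backend pod IP may not appear literally in the file:

kubectl -n ingress-nginx exec <controller-pod> -- cat /etc/nginx/nginx.conf | less

In there you should find a server block with server_name jupyterhub.<redacted>, a listen 443 ssl directive for the TLS termination, and location blocks that proxy requests on to the backend.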
Update: in the meantime, I identified what causes Kubernetes to break: restarting firewalld.
As best I can tell, that wipes out all iptables rules, not just the firewalld-generated ones.
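A quick way to see this happening on a node (a sketch; the container name assumes an RKE setup, where kube-proxy runs as a plain Docker container called kube-proxy):

# count the KUBE-* rules in the nat table
iptables -t nat -S | grep -c KUBE-
# restarting firewalld makes that count collapse to (nearly) zero
systemctl restart firewalld
iptables -t nat -S | grep -c KUBE-
# kube-proxy re-creates its rules when restarted
docker restart kube-proxy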
I found the answer to my question here: https://www.stackrox.io/blog/kubernetes-networking-demystified/
One caveat: the details may vary to some extent depending on which CNI you are using, although everything I saw was strictly related to Kubernetes itself.
I'm still trying to digest the content of that blog, and I highly recommend referring directly to it instead of relying on my answer, which could be a poor retelling of the story.
Here is, approximately, how a packet that arrives on port 443 flows.
You will need the following command to see the relevant tables:
iptables -t nat -vnL | less
The output of this looks rather intimidating, so the excerpt further below cuts out a lot of other chains and rules to get to the point and traces how a single packet flows through them.
There is some additional complexity involved if the pod is on a different node than the one the packet arrived on, and also if multiple pods are load-balanced behind the same service port. Load balancing seems to be handled by the iptables statistic module randomly picking one of several otherwise identical rules, as illustrated below.
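To illustrate the load-balancing part: with two endpoints behind one service, the per-service chain contains roughly the following (the chain names here are made up; the probability annotation is what the statistic module adds):

Chain KUBE-SVC-EXAMPLE (1 references)
 pkts bytes target           prot opt in  out  source     destination
    0     0 KUBE-SEP-FIRST   all  --  *   *    0.0.0.0/0  0.0.0.0/0   statistic mode random probability 0.50000000000
    0     0 KUBE-SEP-SECOND  all  --  *   *    0.0.0.0/0  0.0.0.0/0

The first rule is taken 50% of the time; the second one catches everything that falls through.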
Internal traffic from a service to a pod follows a similar, but not identical, flow. In the example below, traffic from inside the cluster to the hub service (ClusterIP 10.32.0.183, port 8081) gets DNATed to the pod at 10.244.6.112:8081:
Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
...
KUBE-SERVICES  all  --  *  *  0.0.0.0/0  0.0.0.0/0

Chain KUBE-SERVICES (2 references)
...
/* Traffic from within the cluster to 10.32.0.183:8081 */
    0     0 KUBE-SVC-ZHCKOT5PFJF4PASJ  tcp  --  *  *  0.0.0.0/0  10.32.0.183  tcp dpt:8081
...

/* Mark the packet */
Chain KUBE-SVC-ZHCKOT5PFJF4PASJ (1 references)
 pkts bytes target                     prot opt in  out  source          destination
    0     0 KUBE-MARK-MASQ             tcp  --  *   *    !10.244.0.0/16  10.32.0.183  tcp dpt:8081
    0     0 KUBE-SEP-RYU73S2VFHOHW4XO  all  --  *   *    0.0.0.0/0       0.0.0.0/0

/* Perform DNAT, redirecting from 10.32.0.183 to 10.244.6.112 */
Chain KUBE-SEP-RYU73S2VFHOHW4XO (1 references)
    0     0 KUBE-MARK-MASQ             all  --  *   *    10.244.6.112    0.0.0.0/0
    0     0 DNAT                       tcp  --  *   *    0.0.0.0/0       0.0.0.0/0    tcp to:10.244.6.112:8081
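To trace a specific service by hand, you can follow the chains the same way; for example, for the hub ClusterIP above (the KUBE-SVC-* and KUBE-SEP-* names are the ones from my output; yours will differ):

# find the per-service chain that matches the ClusterIP
iptables -t nat -nL KUBE-SERVICES | grep 10.32.0.183
# list that chain, then the per-endpoint KUBE-SEP-* chain it jumps to
iptables -t nat -nL KUBE-SVC-ZHCKOT5PFJF4PASJ
iptables -t nat -nL KUBE-SEP-RYU73S2VFHOHW4XO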
As for the second part of my question, how to get the nodes to work reliably:
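The short version seems to be: don't restart firewalld on a running node, because that wipes the iptables rules that kube-proxy and the CNI have installed, and they only come back once those components rewrite them (which is what a node restart effectively triggers). A sketch of a lighter-weight recovery, assuming an RKE setup where kube-proxy runs as a Docker container named kube-proxy and canal lives in kube-system (<node-name> is a placeholder, and the label selector may differ in your cluster):

# on the affected node: have kube-proxy rewrite its nat rules
docker restart kube-proxy
# from anywhere with kubectl access: bounce the canal/flannel pod for that node
kubectl -n kube-system delete pod -l k8s-app=canal --field-selector spec.nodeName=<node-name>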