We have a two-node K3s cluster with one master and one worker node and would like "reasonable availability": if either node goes down, the cluster should still work, i.e. ingress should still reach the services and pods, which we have replicated across both nodes. We have an external load balancer (F5) which does active health checks on each node and only sends traffic to nodes that are up.
Unfortunately, if the master goes down, the worker will not serve any traffic (ingress).
This is strange because all the service pods (which ingress feeds) on the worker node are running.
We suspect the reason is that key services such as the traefik ingress controller and coredns are only running on the master.
Indeed, when we simulated a master failure (later restoring it from a backup), none of the pods on the worker could do any DNS resolution; only a reboot of the worker fixed this.
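For anyone debugging a similar setup, a quick way to confirm where the system pods are scheduled and whether in-cluster DNS still works from the surviving node is something like the following (the node name node2, the busybox image, and the test hostname are just placeholders):

# Show which node each kube-system pod (traefik, coredns, ...) is running on
kubectl get pods -n kube-system -o wide

# Run a throwaway pod pinned to the worker (nodeName bypasses the scheduler)
# and try a cluster-internal DNS lookup from there
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 \
  --overrides='{"apiVersion":"v1","spec":{"nodeName":"node2"}}' \
  -- nslookup kubernetes.default.svc.cluster.local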
We've tried to increase the number of replicas of the traefik and coredns deployments, which helps a bit but does not fully solve the problem.
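(For reference, bumping the replica counts amounts to something like the following; the deployment names assume the default k3s installs in kube-system.)

# Scale the bundled coredns and traefik deployments to one replica per node
kubectl -n kube-system scale deployment coredns --replicas=2
kubectl -n kube-system scale deployment traefik --replicas=2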
We would appreciate some advice and explanation:
Should traefik and coredns be DaemonSets by default?

UPDATE: Ingress description:
kubectl describe ingress -n msa
Name: msa-ingress
Namespace: msa
Address: 10.3.229.111,10.3.229.112
Default backend: default-http-backend:80 (<error: endpoints "default-http-backend" not found>)
TLS:
tls-secret terminates service.ourdomain.com,node1.ourdomain.com,node2.ourdomain.com
Rules:
Host Path Backends
---- ---- --------
service.ourdomain.com
/ gateway:8443 (10.42.0.100:8443,10.42.1.115:8443)
node1.ourdomain.com
/ gateway:8443 (10.42.0.100:8443,10.42.1.115:8443)
node2.ourdomain.com
/ gateway:8443 (10.42.0.100:8443,10.42.1.115:8443)
Annotations: kubernetes.io/ingress.class: traefik
traefik.ingress.kubernetes.io/router.middlewares: msa-middleware@kubernetescrd
Events: <none>
Your goals seem achievable with a few standard Kubernetes features (none of them specific to Traefik):
Ensure you have one replica of the ingress controller's pod on each node => install it as a DaemonSet (see the sketch after this list).
To fix the error shown in the ingress description, set the correct load balancer IP on the ingress controller's Service.
Set externalTrafficPolicy to "Local" - this ensures that traffic is routed to local endpoints only (i.e. the controller pods running on the node that accepts traffic from the load balancer).
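On k3s, one way to switch the bundled Traefik from a Deployment to a DaemonSet is a HelmChartConfig manifest; this is only a sketch, assuming the default packaged Traefik chart in kube-system and its deployment.kind value:

apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    # Run one Traefik pod on every node instead of a single Deployment replica
    deployment:
      kind: DaemonSet

As for externalTrafficPolicy, the Kubernetes documentation describes it as follows: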
externalTrafficPolicy - denotes whether this Service desires to route external traffic to node-local or cluster-wide endpoints. There are two available options: Cluster (default) and Local. Cluster obscures the client source IP and may cause a second hop to another node, but should have good overall load-spreading. Local preserves the client source IP and avoids a second hop for LoadBalancer and NodePort type Services, but risks potentially imbalanced traffic spreading.
apiVersion: v1
kind: Service
metadata:
  name: example-service
spec:
  selector:
    app: example
  ports:
    - port: 8765
      targetPort: 9376
  externalTrafficPolicy: Local
  type: LoadBalancer
Set externalTrafficPolicy: Local on the Traefik (ingress controller) Service too.
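Assuming the bundled Traefik Service lives in kube-system under the name traefik (the k3s default), one quick way to apply this is a patch like:

# Switch the existing Traefik Service to node-local traffic routing
kubectl -n kube-system patch svc traefik \
  -p '{"spec": {"externalTrafficPolicy": "Local"}}'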