Something went wrong with my RPi 4 cluster, which I set up with k3sup.
Everything worked as expected until yesterday, when I had to reinstall the operating system on the master node. For example, I have Redis installed on the master node and some pods on the worker nodes. My pods cannot connect to Redis via its DNS name, redis-master.database.svc.cluster.local (but they could the day before).
Name resolution fails when I test it with busybox:
kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- nslookup redis-master.database.svc.cluster.local
When I ping the service by its cluster IP (also from busybox):
kubectl run -it --rm --restart=Never busybox --image=busybox:1.28 -- ping 10.43.115.159
it reports 100% packet loss.
I can work around the DNS issue by changing the CoreDNS config (replacing the line forward . /etc/resolv.conf with forward . 192.168.1.101), but I don't think that's a good solution, since I didn't have to do that before.
It also only fixes the name-to-IP mapping; connecting via IP still doesn't work.
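To see which upstream CoreDNS currently forwards to before changing anything, the forward line can be pulled out of the Corefile. This is only a minimal sketch: forward_upstreams is a hypothetical helper name, and the commented kubectl call assumes the CoreDNS ConfigMap is called coredns and lives in the kube-system namespace, which may differ on your install.

```shell
# Hypothetical helper: reads a Corefile on stdin and prints the upstream
# targets of every `forward` directive (syntax: forward FROM TO...).
forward_upstreams() {
    awk '$1 == "forward" {
        out = ""
        for (i = 3; i <= NF; i++) {
            if ($i == "{") break          # stop where an option block opens
            out = out (out == "" ? "" : " ") $i
        }
        print out
    }'
}

# Assumed usage (ConfigMap name and namespace are assumptions):
# kubectl -n kube-system get configmap coredns -o jsonpath='{.data.Corefile}' | forward_upstreams
```

With the original config this should print /etc/resolv.conf, meaning CoreDNS forwards to whatever resolvers the node itself uses.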
My nodes:
NAME     STATUS   ROLES    AGE   VERSION         INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                       KERNEL-VERSION   CONTAINER-RUNTIME
node-4   Ready    <none>   10h   v1.19.15+k3s2   192.168.1.105   <none>        Debian GNU/Linux 10 (buster)   5.10.60-v8+      containerd://1.4.11-k3s1
node-3   Ready    <none>   10h   v1.19.15+k3s2   192.168.1.104   <none>        Debian GNU/Linux 10 (buster)   5.10.60-v8+      containerd://1.4.11-k3s1
node-1   Ready    <none>   10h   v1.19.15+k3s2   192.168.1.102   <none>        Debian GNU/Linux 10 (buster)   5.10.60-v8+      containerd://1.4.11-k3s1
node-0   Ready    master   10h   v1.19.15+k3s2   192.168.1.101   <none>        Debian GNU/Linux 10 (buster)   5.10.63-v8+      containerd://1.4.11-k3s1
node-2   Ready    <none>   10h   v1.19.15+k3s2   192.168.1.103   <none>        Debian GNU/Linux 10 (buster)   5.10.60-v8+      containerd://1.4.11-k3s1
The master node has a taint: role=master:NoSchedule.
Any ideas?
UPDATE 1
I'm able to connect to the Redis pod. This is /etc/resolv.conf from redis-master-0:
search database.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.43.0.10
options ndots:5
All services on kubernetes:
NAMESPACE     NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP                                               PORT(S)                      AGE
default       kubernetes              ClusterIP      10.43.0.1       <none>                                                    443/TCP                      6d9h
kube-system   traefik-prometheus      ClusterIP      10.43.94.137    <none>                                                    9100/TCP                     6d8h
registry      proxy-docker-registry   ClusterIP      10.43.16.139    <none>                                                    5000/TCP                     6d8h
kube-system   kube-dns                ClusterIP      10.43.0.10      <none>                                                    53/UDP,53/TCP,9153/TCP       6d9h
kube-system   metrics-server          ClusterIP      10.43.101.30    <none>                                                    443/TCP                      6d9h
database      redis-headless          ClusterIP      None            <none>                                                    6379/TCP                     5d19h
database      redis-master            ClusterIP      10.43.115.159   <none>                                                    6379/TCP                     5d19h
kube-system   traefik                 LoadBalancer   10.43.221.89    192.168.1.102,192.168.1.103,192.168.1.104,192.168.1.105   80:30446/TCP,443:32443/TCP   6d8h
There was one more thing I had not mentioned: I run OpenVPN with the NordVPN server list on the master node, and Privoxy for the worker nodes.
If you install and start OpenVPN before starting the Kubernetes master, OpenVPN adds rules that block Kubernetes networking. As a result, CoreDNS does not work and you can't reach any pod via IP either.
I'm using an RPi 4 cluster, so for me it was good enough to reinstall the master node, install Kubernetes first and only then configure OpenVPN. Now everything works as expected.
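I did not record exactly which rules OpenVPN added. One place to look, as a rough sketch, is the routing table. The helper below (vpn_routes, a hypothetical name) filters ip route output for routes bound to a tun or tap device, assuming the VPN interface uses the usual tunN/tapN naming. Comparing its output before and after starting the VPN may show what changed; firewall rules would need a separate look, for example with iptables-save.

```shell
# Hypothetical helper: reads `ip route` output on stdin and prints only the
# routes bound to a tun/tap device (assumes the default tunN/tapN naming).
vpn_routes() {
    grep -E ' dev (tun|tap)[0-9]+( |$)'
}

# Assumed usage on the master node:
# ip route | vpn_routes
```

The function exits non-zero when no such route exists, so it can also be used as a quick check in a script.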
Alternatively, it is enough to order your systemd units by adding After or Before to the service definition. My VPN systemd service looks like this:
[Unit]
Description=Enable VPN for System
After=network.target
After=k3s.service
[Service]
Type=simple
ExecStart=/etc/openvpn/start-nordvpn-server.sh
[Install]
WantedBy=multi-user.target
This ensures that the VPN is started after Kubernetes (k3s).
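To double-check that a unit file really declares this ordering, something like the following could be used. It is a minimal sketch: unit_orders_after is a hypothetical helper, and the path in the usage comment is an assumption, since the location of the VPN unit file is not given above.

```shell
# Hypothetical helper: succeeds if the unit file $1 has an After= line that
# names unit $2. After= may list several whitespace-separated units.
unit_orders_after() {
    unit_file=$1
    wanted=$2
    # Take the values of all After= lines, put one unit per line,
    # then look for an exact match.
    sed -n 's/^After=//p' "$unit_file" | tr -s ' \t' '\n\n' | grep -Fxq -- "$wanted"
}

# Assumed usage (the unit file path is hypothetical):
# unit_orders_after /etc/systemd/system/vpn.service k3s.service && echo "ordered after k3s"
```

Note that After= only controls start order; it does not by itself cause k3s to be started. On a running system, systemctl show -p After followed by the unit name reports the ordering systemd actually resolved.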