kubernetes · microk8s

Why are pods still running after microk8s has been stopped?


I'm in learning mode here, so please forgive me if this is a stupid question...

I have just installed microk8s on Ubuntu following the instructions at https://ubuntu.com/tutorials/install-a-local-kubernetes-with-microk8s

Everything works. The "microbot" application gets deployed and exposed and creates a simple web server. But what surprised me is that after I stop microk8s (with "microk8s stop"), the web server is still apparently up and running. It continues to respond to curl with its simple page content.

Is this expected behavior? Do the pods continue to run after the orchestrator has stopped?

Also, I was trying to figure out what microk8s is doing with the network. It fires up its dashboard on 10.152.183.203, but when I look at the interfaces and routing tables on my host, I can't figure out how traffic is being routed to that destination. And if I run tcpdump I can't seem to capture any of the traffic being sent to that address.

Any explanation of what's going on here would be much appreciated!

  • Duncan

Solution

  • But what surprised me is that after I stop microk8s (with "microk8s stop"), the web server is still apparently up and running. It continues to respond to curl with its simple page content.

    Is this expected behavior? Do the pods continue to run after the orchestrator has stopped?

    That's not expected behavior, and I can't reproduce it. What I see is that the service is available for a period of several seconds after running microk8s stop, but eventually everything gets shut down.
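
    If you want to watch that shutdown happen, here's a rough sketch (the NodePort 30978 is just a placeholder; substitute whatever port kubectl get services reports for the microbot service on your machine):

    # Stop microk8s, then watch things wind down.
    microk8s stop

    # The snap's daemons should report "inactive" once the stop completes.
    snap services microk8s

    # Keep polling the microbot NodePort (30978 is a placeholder). It typically
    # keeps answering for a few seconds, then the status drops to 000 as
    # connections start being refused.
    while sleep 1; do
        curl -s -o /dev/null -w '%{http_code}\n' --max-time 2 http://localhost:30978
    done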

    Also, I was trying to figure out what microk8s is doing with the network. It fires up its dashboard on 10.152.183.203, but when I look at the interfaces and routing tables on my host, I can't figure out how traffic is being routed to that destination.

    I've deployed Microk8s locally, and the dashboard Service looks like this:

    root@ubuntu:~# kubectl -n kube-system get service kubernetes-dashboard
    NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
    kubernetes-dashboard   ClusterIP   10.152.183.151   <none>        443/TCP   14m
    

    As you note in your question, I can access the dashboard at https://10.152.183.151, but there are no local interfaces on that network:

    root@ubuntu:~# ip addr |grep 10.152
    <no results>
    

    And there are no meaningful routes to that network. For example, this shows that traffic to that IP would go via the default gateway, which doesn't make any sense:

    root@ubuntu:~# ip route get 10.152.183.151
    10.152.183.151 via 192.168.122.1 dev enp1s0 src 192.168.122.72 uid 0
        cache
    

    What's going on? It turns out that microk8s sets up a bunch of NAT rules in your local firewall configuration. If we look for the dashboard address in the NAT table, we find:

    root@ubuntu:~# iptables-legacy -t nat -S | grep 10.152.183.151
    -A KUBE-SERVICES -d 10.152.183.151/32 -p tcp -m comment --comment "kube-system/kubernetes-dashboard cluster IP" -m tcp --dport 443 -j KUBE-SVC-4HQ2X6RJ753IMQ2F
    -A KUBE-SVC-4HQ2X6RJ753IMQ2F ! -s 10.1.0.0/16 -d 10.152.183.151/32 -p tcp -m comment --comment "kube-system/kubernetes-dashboard cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
    

    If we follow the chain, we find that:

    1. Packets going to 10.152.183.151 enter the NAT PREROUTING chain (or, for connections originating on the host itself, the OUTPUT chain), which sends them to the KUBE-SERVICES chain; the hook rules that do this are shown after this list.

    2. In the KUBE-SERVICES chain, packets to the dashboard (for tcp port 443) are sent to the KUBE-SVC-4HQ2X6RJ753IMQ2F chain.

    3. In the KUBE-SVC-4HQ2X6RJ753IMQ2F chain, packets that did not originate from the pod network (10.1.0.0/16) are first sent to the KUBE-MARK-MASQ chain, which sets a mark on the packet (the mark is consumed later, in the KUBE-POSTROUTING chain shown below), and then everything is sent to the KUBE-SEP-SZAWMA3BPGJYVHOD chain:

      root@ubuntu:~# iptables-legacy -t nat -S KUBE-SVC-4HQ2X6RJ753IMQ2F
      -A KUBE-SVC-4HQ2X6RJ753IMQ2F ! -s 10.1.0.0/16 -d 10.152.183.151/32 -p tcp -m comment --comment "kube-system/kubernetes-dashboard cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
      -A KUBE-SVC-4HQ2X6RJ753IMQ2F -m comment --comment "kube-system/kubernetes-dashboard -> 10.1.243.198:8443" -j KUBE-SEP-SZAWMA3BPGJYVHOD
      
    4. In the KUBE-SEP-SZAWMA3BPGJYVHOD chain, the packets finally hit a DNAT rule that maps the connection to the IP of the pod:

      root@ubuntu:~# iptables-legacy -t nat -S KUBE-SEP-SZAWMA3BPGJYVHOD
      -N KUBE-SEP-SZAWMA3BPGJYVHOD
      -A KUBE-SEP-SZAWMA3BPGJYVHOD -s 10.1.243.198/32 -m comment --comment "kube-system/kubernetes-dashboard" -j KUBE-MARK-MASQ
      -A KUBE-SEP-SZAWMA3BPGJYVHOD -p tcp -m comment --comment "kube-system/kubernetes-dashboard" -m tcp -j DNAT --to-destination 10.1.243.198:8443
      

      We know that 10.1.243.198 is the Pod IP because we can see it like this:

      kubectl -n kube-system get pod kubernetes-dashboard-74b66d7f9c-plj8f -o jsonpath='{.status.podIP}'
      

    So, we can reach the dashboard at 10.152.183.151 because the PREROUTING (or OUTPUT) chain ultimately hits a DNAT rule that maps the ClusterIP of the Service to the Pod IP.
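
    Two details in that chain are worth filling in: the jump into KUBE-SERVICES in step 1 comes from hook rules that kube-proxy installs in the nat PREROUTING and OUTPUT chains, and the mark set by KUBE-MARK-MASQ in step 3 is consumed in the KUBE-POSTROUTING chain, where marked packets are masqueraded on the way out. The exact comments and mark value vary between Kubernetes versions, but the relevant rules look roughly like this (representative, not copied verbatim from the machine above):

    # Where packets enter the Kubernetes NAT machinery:
    iptables-legacy -t nat -S PREROUTING | grep KUBE-SERVICES
    #   -A PREROUTING -m comment --comment "kubernetes service portals" -j KUBE-SERVICES
    iptables-legacy -t nat -S OUTPUT | grep KUBE-SERVICES
    #   -A OUTPUT -m comment --comment "kubernetes service portals" -j KUBE-SERVICES

    # Where the KUBE-MARK-MASQ mark is consumed: unmarked packets return early,
    # marked ones are masqueraded as they leave for the pod network.
    iptables-legacy -t nat -S KUBE-POSTROUTING
    #   -A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
    #   -A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0
    #   -A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE --random-fully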

    And if I run tcpdump I can't seem to capture any of the traffic being sent to that address.

    Based on the above discussion, the destination address is rewritten by DNAT before the packet ever appears on an interface, so a capture filter on the ClusterIP has nothing to match. If we filter on the Pod IP instead, we see the traffic we expect. The following shows the result of me running curl -k https://10.152.183.151 in another window:

    root@ubuntu:~# tcpdump -n -i any -c10 host 10.1.243.198
    tcpdump: data link type LINUX_SLL2
    tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
    listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
    19:20:33.441481 cali7ef1137a66d Out IP 192.168.122.72.33034 > 10.1.243.198.8443: Flags [S], seq 228813760, win 64240, options [mss 1460,sackOK,TS val 3747829344 ecr 0,nop,wscale 7], length 0
    19:20:33.441494 cali7ef1137a66d In  IP 10.1.243.198.8443 > 192.168.122.72.33034: Flags [S.], seq 3905988324, ack 228813761, win 65160, options [mss 1460,sackOK,TS val 1344719721 ecr 3747829344,nop,wscale 7], length 0
    19:20:33.441506 cali7ef1137a66d Out IP 192.168.122.72.33034 > 10.1.243.198.8443: Flags [.], ack 1, win 502, options [nop,nop,TS val 3747829344 ecr 1344719721], length 0
    19:20:33.442754 cali7ef1137a66d Out IP 192.168.122.72.33034 > 10.1.243.198.8443: Flags [P.], seq 1:518, ack 1, win 502, options [nop,nop,TS val 3747829345 ecr 1344719721], length 517
    19:20:33.442763 cali7ef1137a66d In  IP 10.1.243.198.8443 > 192.168.122.72.33034: Flags [.], ack 518, win 506, options [nop,nop,TS val 1344719722 ecr 3747829345], length 0
    19:20:33.443004 cali7ef1137a66d In  IP 10.1.243.198.8443 > 192.168.122.72.33034: Flags [P.], seq 1:772, ack 518, win 506, options [nop,nop,TS val 1344719722 ecr 3747829345], length 771
    19:20:33.443017 cali7ef1137a66d Out IP 192.168.122.72.33034 > 10.1.243.198.8443: Flags [.], ack 772, win 501, options [nop,nop,TS val 3747829345 ecr 1344719722], length 0
    19:20:33.443677 cali7ef1137a66d Out IP 192.168.122.72.33034 > 10.1.243.198.8443: Flags [P.], seq 518:582, ack 772, win 501, options [nop,nop,TS val 3747829346 ecr 1344719722], length 64
    19:20:33.443680 cali7ef1137a66d In  IP 10.1.243.198.8443 > 192.168.122.72.33034: Flags [.], ack 582, win 506, options [nop,nop,TS val 1344719723 ecr 3747829346], length 0
    19:20:33.443749 cali7ef1137a66d In  IP 10.1.243.198.8443 > 192.168.122.72.33034: Flags [P.], seq 772:827, ack 582, win 506, options [nop,nop,TS val 1344719723 ecr 3747829346], length 55
    10 packets captured
    38 packets received by filter
    0 packets dropped by kernel
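
    The cali7ef1137a66d device in that capture is the veth interface that the CNI (Calico, going by the cali* naming) created for the dashboard pod. If you want to capture on it directly rather than using -i any, one way to find the right interface is via the /32 host route that Calico adds for each pod. A sketch, reusing the Pod IP from earlier:

    # Pod IP obtained from the jsonpath command above
    POD_IP=10.1.243.198

    # Calico installs a per-pod /32 route, so this shows the veth device name,
    # e.g. "10.1.243.198 dev cali7ef1137a66d ...":
    ip route get "$POD_IP"

    # Then capture on that interface only:
    tcpdump -n -i cali7ef1137a66d host "$POD_IP"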