
Kubernetes Flannel k8s_install-cni_kube-flannel-ds exited on worker node


I am setting up my very first Kubernetes cluster. We are expecting a mix of Windows and Linux nodes, so I picked Flannel as my CNI. I am using RHEL 7.7 as my master node, I have two other RHEL 7.7 machines as worker nodes, and the rest are Windows Server 2019. For the most part I was following the documentation on the Microsoft site: https://learn.microsoft.com/en-us/virtualization/windowscontainers/kubernetes/getting-started-kubernetes-windows and the one on the Kubernetes site: https://kubernetes.cn/docs/tasks/administer-cluster/kubeadm/adding-windows-nodes/ . I know the article on the Microsoft site is more than 2 years old, but it is the only guide I found for mixed-mode operations.

I have done the following so far on the master and worker RHEL nodes:

  1. Stopped and disabled firewalld
  2. Disabled SELinux
  3. Ran update && upgrade
  4. Disabled the swap partition
  5. Added /etc/hosts entries for all nodes in my Kubernetes cluster
  6. Installed Docker CE 19.03.11
  7. Installed kubectl, kubeadm and kubelet 1.18.3 (build date 2020-05-20)
  8. Prepared the Kubernetes control plane for Flannel: sudo sysctl net.bridge.bridge-nf-call-iptables=1
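For completeness, the sysctl in step 8 does not survive a reboot. A minimal sketch of persisting it (assuming standard RHEL 7 paths and that you can write to /etc; file names here are my own choice):

```shell
# The br_netfilter module must be loaded for the bridge-nf sysctl key to exist.
echo 'br_netfilter' | sudo tee /etc/modules-load.d/br_netfilter.conf

# Persist the setting across reboots, then reload all sysctl configuration files.
echo 'net.bridge.bridge-nf-call-iptables = 1' | sudo tee /etc/sysctl.d/99-kubernetes.conf
sudo sysctl --system
```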

I have now done the following on the RHEL master node:

Initialize cluster

kubeadm init --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12

kubectl as non-root user

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
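Before patching anything, it may be worth a quick sanity check that kubectl can reach the new control plane (standard kubectl subcommands; a single NotReady master is expected until a CNI is installed):

```shell
# Confirm the copied admin config works and the API server responds.
kubectl cluster-info

# The master should be listed; it usually stays NotReady until a CNI is applied.
kubectl get nodes -o wide
```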

Patch the Daemon set for the node selector

wget https://raw.githubusercontent.com/Microsoft/SDN/master/Kubernetes/flannel/l2bridge/manifests/node-selector-patch.yml
kubectl patch ds/kube-proxy --patch "$(cat node-selector-patch.yml)" -n=kube-system
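For reference, this patch simply pins the kube-proxy DaemonSet's pods to Linux nodes, so kube-proxy is never scheduled on the Windows workers. At the time, the downloaded patch file looked roughly like this (check your copy for the exact label key, as newer clusters use kubernetes.io/os instead of the beta label):

```yaml
spec:
  template:
    spec:
      nodeSelector:
        beta.kubernetes.io/os: linux
```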

After the patch, kube-proxy looks like this:

(screenshot of the kube-system DaemonSets)

Add Flannel

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Modify the net-conf.json section of the Flannel manifest to set the VNI to 4096 and the Port to 4789. It should look as follows:

net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan",
        "VNI" : 4096,
        "Port": 4789
      }
    }
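If you would rather script this edit than do it by hand, a sed one-liner can splice the two keys in after the backend type. This is a sketch demonstrated on a stand-in snippet (assuming the stock manifest really contains the line `"Type": "vxlan"`, the same substitution works on kube-flannel.yml itself):

```shell
# Stand-in for the net-conf.json section of kube-flannel.yml.
cat > net-conf-snippet.json <<'EOF'
{
  "Network": "10.244.0.0/16",
  "Backend": {
    "Type": "vxlan"
  }
}
EOF

# Insert VNI 4096 and Port 4789 right after the backend type
# (uses GNU sed's \n in the replacement to emit new lines).
sed -i 's/"Type": "vxlan"/"Type": "vxlan",\n    "VNI": 4096,\n    "Port": 4789/' net-conf-snippet.json
cat net-conf-snippet.json
```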

Apply modified kube-flannel

kubectl apply -f kube-flannel.yml
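After applying the manifest you can watch the DaemonSet roll out. The label and DaemonSet name below match the kube-flannel.yml of that era (`app: flannel`, `kube-flannel-ds-amd64`); verify both against your copy of the manifest:

```shell
# Wait for the Linux flannel DaemonSet to finish rolling out.
kubectl -n kube-system rollout status ds/kube-flannel-ds-amd64

# List the flannel pods and which nodes they landed on.
kubectl -n kube-system get pods -l app=flannel -o wide
```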

After adding the network, here is what I get for the pods in kube-system:

(screenshot of the kube-system pods)

Add Windows Flannel and kube-proxy DaemonSets

curl -L https://github.com/kubernetes-sigs/sig-windows-tools/releases/latest/download/kube-proxy.yml | sed 's/VERSION/v1.18.0/g' | kubectl apply -f -
kubectl apply -f https://github.com/kubernetes-sigs/sig-windows-tools/releases/latest/download/flannel-overlay.yml

Join worker node

I am now trying to join the RHEL 7.7 worker node by executing the kubeadm join command generated when I initialized my cluster. The worker node initializes fine, as seen below:

(screenshot of the successful join)

When I go to my RHEL worker node, I see that the k8s_install-cni_kube-flannel-ds-amd64-f4mtp_kube-system container has exited, as seen below:

(screenshot of the exited container)

  1. Can you please let me know if I am following the correct procedure? I believe the Flannel CNI is required for pods to talk to each other within the Kubernetes cluster.
  2. If Flannel is difficult to set up for mixed mode, is there another network plugin we can use that works?
  3. If we decide to go with RHEL nodes only, what is the best and easiest network plugin I can install without running into a lot of issues?

Thanks and I appreciate it.


Solution

  • There are a lot of materials about Kubernetes on the official site and I encourage you to check them out.

    I divided this answer into parts:

    • CNI
    • Troubleshooting

    CNI

    What is CNI?

    CNI (Container Network Interface), a Cloud Native Computing Foundation project, consists of a specification and libraries for writing plugins to configure network interfaces in Linux containers, along with a number of supported plugins. CNI concerns itself only with network connectivity of containers and removing allocated resources when the container is deleted. Because of this focus, CNI has a wide range of support and the specification is simple to implement.

    -- Github.com: Containernetworking: CNI

    Your CNI plugin, in simple terms, is responsible for pod networking inside your cluster.

    There are multiple CNI plugins like:

    • Flannel
    • Calico
    • Multus
    • Weavenet

    What I mean by that is that you don't need to use Flannel; you can use another plugin, like Calico. The major consideration is that they are different from each other, and you should pick the option that is best for your use case (support for a particular feature, for example).

    There are a lot of materials/resources on this topic. Please take a look at some of them.

    As for:

    If Flannel is difficult to setup for mixed mode, can we use other network which can work?

    If by mixed mode you mean using both Windows and Linux machines as nodes, I would stick to guides that have already been written, like the one you mentioned: Kubernetes.io: Adding Windows nodes

    As for:

    If we decide to go only and only RHEL nodes, what is the best and easiest network plugin I can install without going through lot of issues?

    The best way to choose a CNI plugin is to look for the solution that fits your needs the most. You can follow this link for an overview.

    You can also look here (please bear in mind that this article is from 2018 and could be outdated).


    Troubleshooting

    when I go to my RHEL worker node, I see that k8s_install-cni_kube-flannel-ds-amd64-f4mtp_kube-system container is exited as seen below:

    Your k8s_install-cni_kube-flannel-ds-amd64-f4mtp_kube-system container exited with status 0, which should indicate correct provisioning: the install-cni container's job is to copy the CNI configuration into place on the node and then exit.

    You can check the logs of the Flannel pods by invoking the command below:

    • kubectl logs POD_NAME
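    Concretely, the pod from your screenshot has two containers of interest: install-cni (a one-shot container that copies the CNI config and exits 0) and kube-flannel (the long-running daemon). The container names below are taken from the stock kube-flannel.yml of that version; adjust if yours differ:

```shell
# Logs of the one-shot container that installs the CNI config (expected to exit 0).
kubectl -n kube-system logs kube-flannel-ds-amd64-f4mtp -c install-cni

# Logs of the long-running flannel daemon; errors here point at real networking problems.
kubectl -n kube-system logs kube-flannel-ds-amd64-f4mtp -c kube-flannel
```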

    You can also refer to official documentation of Flannel: Github.com: Flannel: Troubleshooting

    As I said in the comment:

    To check if your CNI is working, you can spawn 2 pods on 2 different nodes and try to make a connection between them (e.g. ping one from the other).

    Steps:

    • Spawn pods
    • Check their IP addresses
    • Exec into pods
    • Ping

    Spawn pods

    Below is an example Deployment definition that will spawn Ubuntu pods. They will be used to check whether pods can communicate across nodes:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: ubuntu
    spec:
      selector:
        matchLabels:
          app: ubuntu
      replicas: 5 
      template: 
        metadata:
          labels:
            app: ubuntu
        spec:
          containers:
          - name: ubuntu
            image: ubuntu:latest
            command:
            - sleep
            - infinity
    

    Please bear in mind that this example is for testing purposes only. Apply the above definition with:

    • kubectl apply -f FILE_NAME.yaml

    Check their IP addresses

    After the pods have spawned, you should be able to run the command:

    • $ kubectl get pods -o wide

    and see output similar to this:

    NAME                      READY   STATUS    RESTARTS   AGE   IP          NODE                                         NOMINATED NODE   READINESS GATES
    ubuntu-557dc88445-lngt7   1/1     Running   0          8s    10.20.0.4   NODE-1   <none>           <none>
    ubuntu-557dc88445-nhvbw   1/1     Running   0          8s    10.20.0.5   NODE-1   <none>           <none>
    ubuntu-557dc88445-p8v86   1/1     Running   0          8s    10.20.2.4   NODE-2   <none>           <none>
    ubuntu-557dc88445-vm2kg   1/1     Running   0          8s    10.20.1.9   NODE-3   <none>           <none>
    ubuntu-557dc88445-xwt86   1/1     Running   0          8s    10.20.0.3   NODE-1   <none>           <none>
    

    From the above output you can see:

    • what IP address each pod has
    • which node each pod is assigned to.

    Using the above example, we will try to make a connection between:

    • ubuntu-557dc88445-lngt7 (the first one) with IP address 10.20.0.4 on NODE-1
    • ubuntu-557dc88445-p8v86 (the third one) with IP address 10.20.2.4 on NODE-2

    Exec into pods

    You can exec into the pod to run commands:

    • $ kubectl exec -it ubuntu-557dc88445-lngt7 -- /bin/bash

    Please take a look at the official documentation here: Kubernetes.io: Get shell running container

    Ping

    Ping is not built into the Ubuntu image, but you can install it with:

    • $ apt update && apt install iputils-ping

    After that you can ping the second pod and check whether you can connect to it:

    root@ubuntu-557dc88445-lngt7:/# ping 10.20.2.4 -c 4
    PING 10.20.2.4 (10.20.2.4) 56(84) bytes of data.
    64 bytes from 10.20.2.4: icmp_seq=1 ttl=62 time=0.168 ms
    64 bytes from 10.20.2.4: icmp_seq=2 ttl=62 time=0.169 ms
    64 bytes from 10.20.2.4: icmp_seq=3 ttl=62 time=0.174 ms
    64 bytes from 10.20.2.4: icmp_seq=4 ttl=62 time=0.206 ms
    
    --- 10.20.2.4 ping statistics ---
    4 packets transmitted, 4 received, 0% packet loss, time 3104ms
    rtt min/avg/max/mdev = 0.168/0.179/0.206/0.015 ms