I tracked down the CPU usage. Even after increasing the number of nodes, I still got a persistent scheduling error containing the following terms: Insufficient cpu, MatchNodeSelector, PodToleratesNodeTaints.
My hint came from this article. It mentions:
Do not allow new pods to schedule onto the node unless they tolerate the taint, but allow all pods submitted to Kubelet without going through the scheduler to start, and allow all already-running pods to continue running. Enforced by the scheduler.
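In other words, a pod can only land on a tainted node if it declares a matching toleration. A minimal sketch of what that looks like in a pod spec, assuming a hypothetical taint of storage=true with the NoSchedule effect (check what Taints your nodes actually report with kubectl describe node):

```yaml
spec:
  tolerations:
    # Hypothetical taint key/value for illustration; replace with
    # whatever `kubectl describe node <node>` lists under Taints.
    - key: "storage"
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
```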
The configuration contains the following.
spec:
  replicas: 1
  template:
    metadata:
      name: ceph-mds
      namespace: ceph
      labels:
        app: ceph
        daemon: mds
    spec:
      nodeSelector:
        node-type: storage
      ... and more ...
Notice the node-type selector. I have to run kubectl label nodes node-type=storage --all so that every node carries the node-type=storage label. Alternatively, I could label only a subset of nodes (for example, kubectl label nodes <node-name> node-type=storage) to dedicate just those as storage nodes.
In kops edit ig nodes, according to this hint, you can add the label under spec as follows.
spec:
  nodeLabels:
    node-type: storage
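For context, a kops instance group manifest with this label in place might look roughly like the sketch below. The machineType, size values, and metadata are placeholders for illustration, not values from my setup:

```yaml
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes          # placeholder instance group name
spec:
  role: Node
  machineType: t3.medium   # placeholder
  minSize: 3               # placeholder
  maxSize: 3               # placeholder
  nodeLabels:
    node-type: storage     # matched by the pod's nodeSelector above
```

After saving the edit, kops update cluster --yes applies the change, and newly provisioned nodes come up with the label that the ceph-mds pod's nodeSelector can match.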