Tags: kubernetes, kops, aws-auto-scaling

cluster-autoscaler deployment fails with "1 Too many pods, 3 node(s) didn't match Pod's node affinity/selector"


I have created a k8s cluster with kops (1.21.4) on AWS and, following the docs on the autoscaler, made the required changes to the cluster (roughly the spec change sketched after the events below). But when the cluster starts, the cluster-autoscaler pod cannot be scheduled on any node. When I describe the pod, I see the following:

Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  4m31s (x92 over 98m)  default-scheduler  0/4 nodes are available: 1 Too many pods, 3 node(s) didn't match Pod's node affinity/selector.
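
For context, the change referred to above is essentially enabling the cluster-autoscaler addon in the kops cluster spec. A minimal sketch, assuming the clusterAutoscaler addon fields from the kops docs:

  spec:
    clusterAutoscaler:
      enabled: true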

Looking at the cluster-autoscaler deployment, I see the following podAntiAffinity:

      affinity:                                                                 
        podAntiAffinity:                                                        
          preferredDuringSchedulingIgnoredDuringExecution:                      
          - podAffinityTerm:                                                    
              labelSelector:                                                    
                matchExpressions:                                               
                - key: app                                                      
                  operator: In                                                  
                  values:                                                       
                  - cluster-autoscaler                                          
              topologyKey: topology.kubernetes.io/zone                          
            weight: 100                                                         
          requiredDuringSchedulingIgnoredDuringExecution:                       
          - labelSelector:                                                      
              matchExpressions:                                                 
              - key: app                                                        
                operator: In                                                    
                values:                                                         
                - cluster-autoscaler                                            
            topologyKey: kubernetes.io/hostname

From this I understand that it wants to prevent running the pod on a node that already has a cluster-autoscaler pod running. But that doesn't seem to explain the error seen in the pod status.

Edit: the autoscaler pod has the following node selector and tolerations:

Node-Selectors:              node-role.kubernetes.io/master=
Tolerations:                 node-role.kubernetes.io/master op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s

So clearly, it should be able to schedule on the master node too.
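
To double-check, the master label and taints can be verified with something like the following (the node name is a placeholder):

  kubectl get nodes -l node-role.kubernetes.io/master=
  kubectl describe node <master-node-name> | grep -i taint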

I am not sure what else I need to do to get the pod up and running.


Solution

  • Posting the answer out of the comments.


    There are podAntiAffinity rules in place, so the first thing to check is whether any scheduling errors are reported. Which is the case:

    0/4 nodes are available: 1 Too many pods, 3 node(s) didn't match Pod's node affinity/selector.
    

    Since there is 1 control plane node (on which the pod is supposed to be scheduled, due to its node selector) and 3 worker nodes, the 3 node(s) didn't match Pod's node affinity/selector part refers to the workers, while 1 Too many pods refers to the control plane node.
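
    A quick way to confirm that it is a pod-capacity problem is to compare the node's allocatable pod count with the number of pods already running on it. A sketch, with a placeholder node name:

        kubectl get node <control-plane-node> -o jsonpath='{.status.allocatable.pods}'
        kubectl get pods -A --field-selector spec.nodeName=<control-plane-node> --no-headers | wc -l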


    Since the cluster runs on AWS, there is a known limit on the number of network interfaces and private IP addresses per instance type - see IP addresses per network interface per instance type.

    A t3.small was used, which has 3 network interfaces and 4 IPs per interface = 12 IPs in total, which was not enough.
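
    To make the capacity math concrete: with ENI-based networking, each interface's primary IP is reserved for the node itself, so the usual pod-capacity estimate (a sketch, assuming the default AWS VPC CNI formula) is:

        maxPods = ENIs × (IPs per ENI − 1) + 2
        t3.small:  3 × (4 − 1) + 2 = 11 pods
        t3.medium: 3 × (6 − 1) + 2 = 17 pods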

    Scaling up to t3.medium resolved the issue.
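
    In kops, the instance type change is an instance-group edit followed by a rolling update. A sketch, with a hypothetical instance group name:

        kops edit ig master-eu-west-1a       # set spec.machineType: t3.medium
        kops update cluster --yes
        kops rolling-update cluster --yes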


    Credit to Jonas's answer for identifying the root cause.