Tags: kubernetes, kubernetes-helm, amazon-eks

Pods Pending for Node-Exporter via Helm on EKS


For troubleshooting purposes I decided to deploy a very vanilla installation of the Prometheus node-exporter via helm install exporter stable/prometheus, but I can't get the pods to start. I've searched high and low and I'm not sure where else to turn; I'm able to install many other apps on this cluster, just not this one. I've attached some troubleshooting output for reference. I suspect it may have something to do with "tolerations", but I'm still digging in.

The EKS cluster is running on three t2.large nodes, each of which supports up to 35 pods, and I'm running a total of 43 pods. Any other ideas for troubleshooting would be greatly appreciated.
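For reference, here's a quick way to tally pods per node and check that math (a rough sketch; the NODE column is the 8th field of kubectl get pods -o wide output and may shift between kubectl versions):

# Count pods per node across all namespaces
kubectl get pods --all-namespaces -o wide --no-headers \
  | awk '{print $8}' | sort | uniq -c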

Get Pods Output

✗ kubectl get pods
NAME                                                              READY   STATUS             RESTARTS   AGE
exporter-prometheus-node-exporter-bcwc4                           0/1     Pending            0          15m
exporter-prometheus-node-exporter-kr7z7                           0/1     Pending            0          15m
exporter-prometheus-node-exporter-lw87g                           0/1     Pending            0          15m

Describe Pods

Name:           exporter-prometheus-node-exporter-bcwc4
Namespace:      monitoring
Priority:       0
Node:           <none>
Labels:         app=prometheus
                chart=prometheus-11.1.2
                component=node-exporter
                controller-revision-hash=668b4894bb
                heritage=Helm
                pod-template-generation=1
                release=exporter
Annotations:    kubernetes.io/psp: eks.privileged
Status:         Pending
IP:
IPs:            <none>
Controlled By:  DaemonSet/exporter-prometheus-node-exporter
Containers:
  prometheus-node-exporter:
    Image:      prom/node-exporter:v0.18.1
    Port:       9100/TCP
    Host Port:  9100/TCP
    Args:
      --path.procfs=/host/proc
      --path.sysfs=/host/sys
    Environment:  <none>
    Mounts:
      /host/proc from proc (ro)
      /host/sys from sys (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from exporter-prometheus-node-exporter-token-rl4fm (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  proc:
    Type:          HostPath (bare host directory volume)
    Path:          /proc
    HostPathType:
  sys:
    Type:          HostPath (bare host directory volume)
    Path:          /sys
    HostPathType:
  exporter-prometheus-node-exporter-token-rl4fm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  exporter-prometheus-node-exporter-token-rl4fm
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  2s (x24 over 29m)  default-scheduler  0/3 nodes are available: 2 node(s) didn't match node selector, 3 node(s) didn't have free ports for the requested pod ports.

DaemonSet Config

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  creationTimestamp: "2020-05-12T06:15:30Z"
  generation: 1
  labels:
    app: prometheus
    chart: prometheus-11.1.2
    component: node-exporter
    heritage: Helm
    release: exporter
  name: exporter-prometheus-node-exporter
  namespace: monitoring
  resourceVersion: "8131959"
  selfLink: /apis/extensions/v1beta1/namespaces/monitoring/daemonsets/exporter-prometheus-node-exporter
  uid: 5ede0739-cd05-4e3b-ace1-87fafb33314a
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: prometheus
      component: node-exporter
      release: exporter
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: prometheus
        chart: prometheus-11.1.2
        component: node-exporter
        heritage: Helm
        release: exporter
    spec:
      containers:
      - args:
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        image: prom/node-exporter:v0.18.1
        imagePullPolicy: IfNotPresent
        name: prometheus-node-exporter
        ports:
        - containerPort: 9100
          hostPort: 9100
          name: metrics
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /host/proc
          name: proc
          readOnly: true
        - mountPath: /host/sys
          name: sys
          readOnly: true
      dnsPolicy: ClusterFirst
      hostNetwork: true
      hostPID: true
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: exporter-prometheus-node-exporter
      serviceAccountName: exporter-prometheus-node-exporter
      terminationGracePeriodSeconds: 30
      volumes:
      - hostPath:
          path: /proc
          type: ""
        name: proc
      - hostPath:
          path: /sys
          type: ""
        name: sys
  templateGeneration: 1
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
status:
  currentNumberScheduled: 3
  desiredNumberScheduled: 3
  numberMisscheduled: 0
  numberReady: 0
  numberUnavailable: 3
  observedGeneration: 1
  updatedNumberScheduled: 3

Solution

  • 3 node(s) didn't have free ports for the requested pod ports.

    The scheduler reports the port as unavailable because a pod on every node already declares hostPort 9100; the scheduler tracks host ports declared by pods, so this is quite possibly another node-exporter shipped by a different chart or release. Since this DaemonSet sets hostPort: 9100, each of its pods must bind that exact port on its node, and each <hostIP, hostPort, protocol> combination must be unique. This reason alone covers all three nodes and explains the 0/3 result; the "didn't match node selector" count is incidental. Ref: https://kubernetes.io/docs/concepts/configuration/overview/#services
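    To confirm which pods hold the port, one option is to list every pod that declares hostPort 9100; a quick sketch, assuming jq is available:

    # Print namespace/name -> node for every pod claiming hostPort 9100
    kubectl get pods --all-namespaces -o json \
      | jq -r '.items[]
          | select(any(.spec.containers[].ports[]?; .hostPort == 9100))
          | .metadata.namespace + "/" + .metadata.name + " -> " + (.spec.nodeName // "<pending>")'

    If another release owns the port, either remove it or move this one to a free port. The stable/prometheus chart appears to expose the port under nodeExporter.service (verify these keys against your chart version's values.yaml before relying on them):

    # Move the node-exporter host port to 9101 to avoid the conflict
    helm upgrade exporter stable/prometheus \
      --set nodeExporter.service.hostPort=9101 \
      --set nodeExporter.service.servicePort=9101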