
Elasticsearch statefulset in kubernetes pod state is CrashLoopBackOff


I am trying to create an Elasticsearch StatefulSet in Kubernetes, but my pods keep cycling from Running to Error to CrashLoopBackOff and back to Running. I have 2 replicas, and Minikube is running with 8 CPUs and 15 GB of memory. Why does my laptop almost hang while a pod is in the Running state? The system memory usage history shows memory at 90%, and then the pod goes back to CrashLoopBackOff. Here is the output of kubectl get pods -w:

NAME                 READY   STATUS             RESTARTS      AGE
elastic-stateful-0   0/1     CrashLoopBackOff   3 (42s ago)   70m
elastic-stateful-1   0/1     CrashLoopBackOff   3 (20s ago)   3m25s
elastic-stateful-0   1/1     Running            4 (51s ago)   70m
elastic-stateful-0   0/1     Error              4 (72s ago)   71m
elastic-stateful-1   1/1     Running            4 (50s ago)   3m55s
elastic-stateful-0   0/1     CrashLoopBackOff   4 (12s ago)   71m
elastic-stateful-1   0/1     Error              4 (70s ago)   4m15s
elastic-stateful-1   0/1     CrashLoopBackOff   4 (11s ago)   4m26s
elastic-stateful-0   1/1     Running            5 (90s ago)   72m
elastic-stateful-1   1/1     Running            5 (86s ago)   5m41s
elastic-stateful-0   0/1     Error              5 (111s ago)   72m
elastic-stateful-0   0/1     CrashLoopBackOff   5 (14s ago)    73m
elastic-stateful-1   0/1     Error              5 (110s ago)   6m5s
elastic-stateful-1   0/1     CrashLoopBackOff   5 (16s ago)    6m20s

kubectl describe pod elastic-stateful-0

shows

Name:         elastic-stateful-0
Namespace:    default
Priority:     0
Node:         minikube/192.168.49.2
Start Time:   Fri, 10 Mar 2023 20:21:08 +0500
Labels:       app=elastic-label
              controller-revision-hash=elastic-stateful-766d849885
              statefulset.kubernetes.io/pod-name=elastic-stateful-0
Annotations:  <none>
Status:       Running
IP:           172.17.0.3
IPs:
  IP:           172.17.0.3
Controlled By:  StatefulSet/elastic-stateful
Containers:
  elastic-container:
    Container ID:   docker://bab1650f5014677283cccb030c1c91d949096888671dbf6b285ac32ff1ad126d
    Image:          elasticsearch:8.4.3
    Image ID:       docker-pullable://elasticsearch@sha256:bb72a5788e156171b111d2fc21825d007f235c3314295aa86d0ef500678923bd
    Port:           9200/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    78
      Started:      Fri, 10 Mar 2023 21:30:37 +0500
      Finished:     Fri, 10 Mar 2023 21:31:00 +0500
    Ready:          False
    Restart Count:  3
    Environment:
      discovery.type:                     <set to the key 'es.discovery.type' of config map 'elasticsearch-configmap'>                     Optional: false
      xpack.security.enabled:             <set to the key 'es.xpack.security.enabled' of config map 'elasticsearch-configmap'>             Optional: false
      xpack.security.enrollment.enabled:  <set to the key 'es.xpack.security.enrollment.enabled' of config map 'elasticsearch-configmap'>  Optional: false
      xpack.security.http.ssl.enabled:    <set to the key 'es.xpack.security.http.ssl.enabled' of config map 'elasticsearch-configmap'>    Optional: false
      ingest.geoip.downloader.enabled:    <set to the key 'es.ingest.geoip.downloader.enabled' of config map 'elasticsearch-configmap'>    Optional: false
      discovery.seed_hosts:               elastic-stateful-0.elastic-service.default.svc.cluster.local,elastic-stateful-1.elastic-service.default.svc.cluster.local
      cluster.initial_master_nodes:       elastic-stateful-0
    Mounts:
      /usr/share/elasticsearch/data from elastic-pvc (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kcgkg (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  elastic-pvc:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  elastic-pvc-elastic-stateful-0
    ReadOnly:   false
  kube-api-access-kcgkg:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                   From     Message
  ----     ------   ----                  ----     -------
  Warning  Failed   45m (x2 over 69m)     kubelet  Error: ErrImagePull
  Warning  Failed   45m                   kubelet  Failed to pull image "elasticsearch:8.4.3": rpc error: code = Unknown desc = context canceled
  Normal   Pulling  44m (x3 over 70m)     kubelet  Pulling image "elasticsearch:8.4.3"
  Normal   Pulled   35m                   kubelet  Successfully pulled image "elasticsearch:8.4.3" in 9m30.225714793s
  Warning  Failed   33m (x11 over 35m)    kubelet  Error: configmap "elasticsearch-configmap" not found
  Normal   Pulled   5m3s (x142 over 35m)  kubelet  Container image "elasticsearch:8.4.3" already present on machine
  Warning  BackOff  3s (x5 over 2m13s)    kubelet  Back-off restarting failed container

Here are the manifest files:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elastic-stateful

spec:
  serviceName: elastic-service
  replicas: 2
  selector:
    matchLabels:
      app: elastic-label
  template:
    metadata:
      name: elastic-pod
      labels:
        app: elastic-label
    spec:
      containers:
        - name: elastic-container
          image: elasticsearch:8.4.3
          ports:
            - containerPort: 9200
          env:
            - name: discovery.type
              valueFrom:
                configMapKeyRef:
                  name: elasticsearch-configmap
                  key: es.discovery.type
            - name: xpack.security.enabled
              valueFrom:
                configMapKeyRef:
                  name: elasticsearch-configmap
                  key: es.xpack.security.enabled
            - name: xpack.security.enrollment.enabled
              valueFrom:
                configMapKeyRef:
                  name: elasticsearch-configmap
                  key: es.xpack.security.enrollment.enabled
            - name: xpack.security.http.ssl.enabled
              valueFrom:
                configMapKeyRef:
                  name: elasticsearch-configmap
                  key: es.xpack.security.http.ssl.enabled
            - name: ingest.geoip.downloader.enabled
              valueFrom:
                configMapKeyRef:
                  name: elasticsearch-configmap
                  key: es.ingest.geoip.downloader.enabled
            - name: discovery.seed_hosts
              value: "elastic-stateful-0.elastic-service.default.svc.cluster.local,elastic-stateful-1.elastic-service.default.svc.cluster.local"
            - name: cluster.initial_master_nodes
              value: "elastic-stateful-0"
          volumeMounts:
            - name: elastic-pvc
              mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
    - metadata:
        name: elastic-pvc
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1Gi


---
apiVersion: v1
kind: Service
metadata:
  name: elastic-service

spec:
#  type: ClusterIP
  clusterIP: None
  selector:
    app: elastic-label
  ports:
    - protocol: TCP
      port: 9200
      targetPort: 9200

ConfigMap file:

apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-configmap
data:
  es.discovery.type: "multi-node"
  es.xpack.security.enabled: "false"
  es.xpack.security.enrollment.enabled: "false"
  es.xpack.security.http.ssl.enabled: "false"
  es.ingest.geoip.downloader.enabled: "false"

I have 16 GB of RAM and a Core i7.


Solution

  • Based on the kubectl events, the image looks too big: the first pulls failed with ErrImagePull, and the successful pull took about 9.5 minutes. Try to reduce the image size.

    Refer to Rafael Benevides' Red Hat article "Keep it small: a closer look at Docker image sizing", which may help to resolve your issue.

    If the image is still too big, try increasing the pull deadline by adding --image-pull-progress-deadline=10m to the ExecStart line of your kubelet.service (this flag is specific to the Docker container runtime, so it only helps if your kubelet version still supports it), then run the commands below:

    sudo systemctl daemon-reload
    sudo systemctl restart kubelet
    

    Also update Docker to the latest version and clean up unused Docker resources; if required, reinstall Docker.

    Temporary workaround: use the docker pull IMAGE_NAME command to pull the image manually. If you pull an image this way, make sure the container spec of your workload sets imagePullPolicy: IfNotPresent.

    Note: if the data is shipped with the image, you can use Persistent Volumes to hold the data required for the workload.
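
As a minimal sketch of the temporary workaround above, assuming the StatefulSet from the question, the pull policy could be set on the container as follows. Only the imagePullPolicy line is new; the other fields are copied from the question's manifest to show where the setting goes.

```yaml
# Fragment of the question's StatefulSet (elastic-stateful).
# Merge this into the existing manifest; it is not a complete resource.
spec:
  template:
    spec:
      containers:
        - name: elastic-container
          image: elasticsearch:8.4.3
          # Reuse the image already cached on the node and
          # pull from the registry only when it is missing.
          imagePullPolicy: IfNotPresent
```

For an image with a fixed tag such as 8.4.3, Kubernetes already defaults to IfNotPresent, so setting it explicitly mostly documents the intent. On Minikube the image also has to exist inside the Minikube node, not only on the host; running minikube image load elasticsearch:8.4.3 is one way to copy it in.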