Problem
I'm trying to deploy a pod, which is failing with an error I can't understand. The pod is run via Airflow to execute a particular task. Airflow shows the pod as failing, without any logs. When I run kubectl describe pod my-pod, I get the following output.
What should I do to determine the root cause of the issue?
The failing container section:
base:
Container ID: <ID>
Image: <IMAGE>
Image ID: <ID>
Port: <none>
Host Port: <none>
Command:
airflow
run
/var/airflow/my_dag_name.py
task_name
2023-02-20T23:15:00+00:00
--local
--pool
default_pool
-sd
/var/airflow/my_dag_name.py
State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 20 Feb 2023 20:55:07 -0600
Finished: Mon, 20 Feb 2023 20:55:11 -0600
Ready: False
Restart Count: 0
Limits:
cpu: 1
ephemeral-storage: 100Gi
memory: 8Gi
Requests:
cpu: 500m
ephemeral-storage: 1Gi
memory: 8Gi
Environment:
<ENV VARS>
Mounts:
<VARIOUS MOUNTS>
The events section (this is complete):
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 58s default-scheduler Successfully assigned <TASK> to <IP>
Normal Pulled 58s kubelet Container image <SIDECAR IMAGE 1> already present on machine
Normal Created 57s kubelet Created container <SIDECAR CONTAINER 1>
Normal Started 57s kubelet Started container <SIDECAR CONTAINER 1>
Normal Pulling 54s kubelet Pulling image <SIDECAR IMAGE 2>
Normal Pulled 53s kubelet Successfully pulled image <SIDECAR IMAGE 2> in 125.691281ms
Normal Created 53s kubelet Created container <SIDECAR CONTAINER 2>
Normal Started 53s kubelet Started container <SIDECAR CONTAINER 2>
Normal Pulled 52s kubelet Container image <FAILING POD IMAGE> already present on machine
Normal Created 52s kubelet Created container <FAILING POD CONTAINER>
Normal Started 52s kubelet Started container <FAILING POD CONTAINER>
Normal Pulled 52s kubelet Container image <SIDECAR IMAGE 3> already present on machine
Normal Created 52s kubelet Created container <SIDECAR CONTAINER 3>
Normal Started 52s kubelet Started container <SIDECAR CONTAINER 3>
Normal Pulled 52s kubelet Container image <SIDECAR IMAGE 4> already present on machine
Normal Created 52s kubelet Created container <SIDECAR CONTAINER 4>
Normal Started 51s kubelet Started container <SIDECAR CONTAINER 4>
Context
The pods use these temporary sidecars to connect to systems / inject information / etc.
In Kubernetes, container exit codes are very helpful for diagnosing pod issues. If a pod is unhealthy, you can start investigating with the command below:
kubectl describe pod [POD_NAME]
You have already provided its output, which shows the following:
State: Terminated
Reason: Error
Exit Code: 1
Since the container terminated with exit code 1, the container and its application need a closer look: this code usually indicates an application error or an invalid reference (for example, the command pointing at a file that doesn't exist in the image).
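If you want to pull the exit code out programmatically rather than reading the describe output by eye, standard text tools are enough. A minimal sketch, run here against a saved copy of the output since it doesn't need a live cluster:

```shell
# Extract the exit code from saved `kubectl describe pod` output.
# Against a live cluster you could pipe `kubectl describe pod my-pod`
# straight into the awk command instead, or query the API directly:
#   kubectl get pod my-pod -o jsonpath='{.status.containerStatuses[0].state.terminated.exitCode}'
describe_output='State:          Terminated
  Reason:       Error
  Exit Code:    1'

printf '%s\n' "$describe_output" | awk -F': *' '/Exit Code/ {print $2}'
# prints: 1
```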
As a first step, as suggested by Harsh Manvar, check the logs of the pod in question. The command below retrieves the logs for the first container in the pod:
kubectl logs <pod-name> -p
-p is short for --previous: if the pod has been restarted, it returns the logs for the previous instance of the container.
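One wrinkle in your case: the pod runs several containers (the base container plus the sidecars), so kubectl logs also needs the -c flag to target the right one. A small sketch that just assembles the command, with the pod name my-pod and container name base taken from the question's describe output (run the printed command against your cluster):

```shell
# Build the kubectl logs invocation for a specific container in the pod.
# "my-pod" and "base" come from the question; substitute your own names.
pod=my-pod
container=base
cmd="kubectl logs $pod -c $container --previous"
echo "$cmd"
# prints: kubectl logs my-pod -c base --previous
```

Note that the describe output shows Restart Count: 0, so the container was never restarted; in that case plain kubectl logs my-pod -c base already returns the terminated container's output and --previous is unnecessary.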
The logs should reveal the root cause of the exit code 1, and that information can be used to fix the command field in the pod's YAML file. Once updated, re-apply it to the cluster with kubectl apply.
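As a purely hypothetical illustration of that last step (the manifest fragment, file path, and the fix are invented; the real change depends on what the logs show): one common culprit with this particular command is the Airflow 1.x CLI syntax, since Airflow 2.x renamed airflow run to airflow tasks run.

```shell
# Hypothetical: rewrite a 1.x-style "airflow run" command in a saved
# manifest fragment to the 2.x "airflow tasks run" form, then re-apply.
cat > /tmp/my-pod.yaml <<'EOF'
  command:
    - airflow
    - run
EOF

# Insert "- tasks" before "- run", preserving the indentation (GNU sed).
sed -i 's/^\( *\)- run$/\1- tasks\n\1- run/' /tmp/my-pod.yaml
cat /tmp/my-pod.yaml

# kubectl apply -f /tmp/my-pod.yaml   # needs a live cluster
```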
The above information is derived from an article written by James Walker.