I'm facing a problem with the groundnuty/k8s-wait-for image (project on GitHub, repo on Docker Hub). I'm fairly sure the error is in the command arguments, because the init container fails with Init:CrashLoopBackOff.
About the image: it is meant for init containers that need to postpone a pod's deployment. The script in the image waits for a pod or job to complete; once that happens, it lets the main container of every replica start.
In my example, it should wait for a job named {{ .Release.Name }}-os-server-migration-{{ .Release.Revision }} to finish, and once it detects that the job has finished it should let the main containers start. Helm templates are used.
As I understand it, the job name is {{ .Release.Name }}-os-server-migration-{{ .Release.Revision }}, and the second argument of the init container in deployment.yml must match it so that the init container depends on that job. Any other opinions or experiences with this approach?
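For illustration, with a hypothetical release named my-release at revision 2, the init container arguments would render to:

args:
  - "job"
  - "my-release-os-server-migration-2"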
The templates are attached below.
DEPLOYMENT.YML:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-os-{{ .Release.Revision }}
  namespace: {{ .Values.namespace }}
  labels:
    app: {{ .Values.fullname }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ .Values.fullname }}
  template:
    metadata:
      labels:
        app: {{ .Values.fullname }}
    spec:
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - name: http
              containerPort: 8080
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
      initContainers:
        - name: "{{ .Chart.Name }}-init"
          image: "groundnuty/k8s-wait-for:v1.3"
          imagePullPolicy: "{{ .Values.init.pullPolicy }}"
          args:
            - "job"
            - "{{ .Release.Name }}-os-server-migration-{{ .Release.Revision }}"
JOB.YML:
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ .Release.Name }}-os-server-migration-{{ .Release.Revision }}
  namespace: {{ .Values.migration.namespace }}
spec:
  backoffLimit: {{ .Values.migration.backoffLimit }}
  template:
    spec:
      {{- with .Values.migration.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      containers:
        - name: {{ .Values.migration.fullname }}
          image: "{{ .Values.migration.image.repository }}:{{ .Values.migration.image.tag }}"
          imagePullPolicy: {{ .Values.migration.image.pullPolicy }}
          command:
            - sh
            - /app/migration-entrypoint.sh
      restartPolicy: {{ .Values.migration.restartPolicy }}
LOGS:
Normal Scheduled 46s default-scheduler Successfully assigned development/octopus-dev-release-os-1-68cb9549c8-7jggh to minikube
Normal Pulled 41s kubelet Successfully pulled image "groundnuty/k8s-wait-for:v1.3" in 4.277517553s
Normal Pulled 36s kubelet Successfully pulled image "groundnuty/k8s-wait-for:v1.3" in 3.083126925s
Normal Pulling 20s (x3 over 45s) kubelet Pulling image "groundnuty/k8s-wait-for:v1.3"
Normal Created 18s (x3 over 41s) kubelet Created container os-init
Normal Started 18s (x3 over 40s) kubelet Started container os-init
Normal Pulled 18s kubelet Successfully pulled image "groundnuty/k8s-wait-for:v1.3" in 1.827195139s
Warning BackOff 4s (x4 over 33s) kubelet Back-off restarting failed container
kubectl get all -n development
NAME READY STATUS RESTARTS AGE
pod/octopus-dev-release-os-1-68cb9549c8-7jggh 0/1 Init:CrashLoopBackOff 2 44s
pod/octopus-dev-release-os-1-68cb9549c8-9qbdv 0/1 Init:CrashLoopBackOff 2 44s
pod/octopus-dev-release-os-1-68cb9549c8-c8h5k 0/1 Init:Error 2 44s
pod/octopus-dev-release-os-migration-1-9wq76 0/1 Completed 0 44s
......
......
NAME COMPLETIONS DURATION AGE
job.batch/octopus-dev-release-os-migration-1 1/1 26s 44s
For anyone facing the same issue, here is an explanation of my fix.
The problem was that the pods created by deployment.yaml had no permission to use the Kubernetes API. The groundnuty/k8s-wait-for:v1.3 container therefore could not check whether the job {{ .Release.Name }}-os-server-migration-{{ .Release.Revision }} had completed, which is why the init containers immediately failed and went into Init:CrashLoopBackOff.
After adding a service account, role, and role binding, everything worked: groundnuty/k8s-wait-for:v1.3 waited for the migration job to finish and then let the main containers run.
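You can see the underlying error in the init container's logs before applying the fix; a command along these lines (pod and container names taken from the output above) shows why the container exits:

kubectl logs pod/octopus-dev-release-os-1-68cb9549c8-7jggh -c os-init -n development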
Here are the manifests for the service account, role, and role binding that solved the issue.
sa.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa-migration
  namespace: development
role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: migration-reader
  namespace: development
rules:
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "watch", "list"]
role-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: migration-reader
  namespace: development
subjects:
  - kind: ServiceAccount
    name: sa-migration
    namespace: development
roleRef:
  kind: Role
  name: migration-reader
  apiGroup: rbac.authorization.k8s.io
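One more detail that is easy to miss: the deployment's pod template must also reference the service account, otherwise the init container keeps running as the namespace's default service account. A minimal sketch of the relevant lines in deployment.yml (only serviceAccountName is new; the surrounding keys are shown for placement):

spec:
  template:
    spec:
      serviceAccountName: sa-migration

You can also verify the permissions before redeploying by impersonating the service account:

kubectl auth can-i get jobs --as=system:serviceaccount:development:sa-migration -n development

This should print "yes" once the role binding is in place.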