I am trying to resolve this error from Kubernetes (here is the 'describe pod' output, with some parts redacted to keep sensitive data out of this post):
~ $ kubectl describe pod service-xyz-68c5f4f99-mn7jl -n development
Name: service-xyz-68c5f4f99-mn7jl
Namespace: development
Priority: 0
Node: autogw.snancakp/<REDACTED>
Start Time: Thu, 24 Aug 2023 09:55:21 -0400
Labels: app=service-xyz
pod-template-hash=68c5f4f99
Annotations: cni.projectcalico.org/containerID: 7c93d46d14e9101887d58a7b4627fd1679b8435a439bbe46a96ec11b36d44981
cni.projectcalico.org/podIP: <REDACTED>/32
cni.projectcalico.org/podIPs: <REDACTED>/32
Status: Running
IP: <REDACTED>
IPs:
IP: <REDACTED>
Controlled By: ReplicaSet/service-xyz-68c5f4f99
Containers:
service-xyz:
Container ID: containerd://<REDACTED>
Image: gitlab.personal.local:5050/teamproject/service-xyz:latest
Image ID: gitlab.personal.local:5050/teamproject/service-xyz@sha256:<REDACTED>
Port: <none>
Host Port: <none>
State: Waiting
Reason: ImagePullBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 24 Aug 2023 09:55:27 -0400
Finished: Thu, 24 Aug 2023 09:55:27 -0400
Ready: False
Restart Count: 0
Environment:
QUEUE: service-xyz
EXCHANGE: <set to the key 'environment' in secret 'teamproject-secrets'> Optional: false
RMQ_URL: <set to the key 'rmq_url' in secret 'teamproject-secrets'> Optional: false
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rtm67 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-rtm67:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m default-scheduler Successfully assigned development/service-xyz-68c5f4f99-mn7jl to machine.goodmachine
Normal Pulled 6m55s kubelet Successfully pulled image "gitlab.personal.local:5050/teamproject/service-xyz:latest" in 5.137384512s (5.137391155s including waiting)
Normal Created 6m54s kubelet Created container service-xyz
Normal Started 6m54s kubelet Started container service-xyz
Normal Pulling 6m11s (x4 over 7m) kubelet Pulling image "gitlab.personal.local:5050/teamproject/service-xyz:latest"
Warning Failed 6m11s (x3 over 6m52s) kubelet Failed to pull image "gitlab.personal.local:5050/teamproject/service-xyz:latest": rpc error: code = Unknown desc = failed to pull and unpack image "gitlab.personal.local:5050/teamproject/service-xyz:latest": failed to resolve reference "gitlab.personal.local:5050/teamproject/service-xyz:latest": failed to authorize: failed to fetch oauth token: unexpected status: 401 Unauthorized
Warning Failed 6m11s (x3 over 6m52s) kubelet Error: ErrImagePull
Normal BackOff 5m34s (x2 over 6m26s) kubelet Back-off pulling image "gitlab.personal.local:5050/teamproject/service-xyz:latest"
Warning Failed 5m34s (x2 over 6m26s) kubelet Error: ImagePullBackOff
Warning BackOff 119s (x18 over 6m52s) kubelet Back-off restarting failed container service-xyz in pod service-xyz-68c5f4f99-mn7jl_development(561cbfc0-addd-4da3-ae6b-dccc2dfa68eb)
The error is an ImagePullBackOff, so I figured I didn't set up my gitlab-ci, secrets, or pod/deployment YAML correctly.
Here is gitlab-ci.yml:
stages:
  - build
  - deploy
variables:
  SERVICE_NAME: "service-xyz"
services:
  - name: docker:dind
    alias: dockerservice
build_image:
  tags:
    - self.hosted
    - linux
  stage: build
  image: docker:latest
  variables:
    DOCKER_HOST: tcp://dockerservice:2375/
    DOCKER_DRIVER: overlay2
    DOCKER_TLS_CERTDIR: ""
  before_script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
  script:
    - docker build -t $CI_REGISTRY_IMAGE .
    - docker push $CI_REGISTRY_IMAGE
  only:
    - main
deploy:
  tags:
    - self.hosted
    - linux
  stage: deploy
  image:
    name: bitnami/kubectl:latest
    entrypoint: [""]
  variables:
    NAMESPACE: development
    SECRET_NAME: regcred
  before_script:
    - export KUBECONFIG=$KUBECONFIG_DEVELOPMENT
  script:
    - kubectl delete secret -n ${NAMESPACE} ${SECRET_NAME} --ignore-not-found
    - kubectl create secret -n ${NAMESPACE} docker-registry ${SECRET_NAME} --docker-server=${CI_REGISTRY} --docker-username=${CI_REGISTRY_USER} --docker-password=${CI_REGISTRY_PASSWORD} --docker-email=${GITLAB_USER_EMAIL}
    - kubectl patch serviceaccount default -p '{"imagePullSecrets":[{"name":"'$SECRET_NAME'"}]}' -n $NAMESPACE
    - kubectl create namespace $NAMESPACE --dry-run=client -o yaml | kubectl apply -f -
    - kubectl apply -f deployment.yaml -n $NAMESPACE
  only:
    - main
  when: manual
Here is the deployment file (deployment.yaml), which references regcred:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: service-xyz
spec:
  replicas: 1
  selector:
    matchLabels:
      app: service-xyz
  template:
    metadata:
      labels:
        app: service-xyz
    spec:
      containers:
        - name: service-xyz
          image: gitlab.personal.local:5050/teamproject/service-xyz:latest
          imagePullPolicy: Always
          env:
            - name: QUEUE
              value: service-xyz
            - name: EXCHANGE
              valueFrom:
                secretKeyRef:
                  name: teamproject-secrets
                  key: environment
            - name: RMQ_URL
              valueFrom:
                secretKeyRef:
                  name: teamproject-secrets
                  key: rmq_url
      imagePullSecrets:
        - name: regcred
When I look at the secrets in Kubernetes (through Rancher), I see the regcred secret in the correct place.
I believe I've set up everything correctly, and I don't know why the deployment won't work. The deployment correctly references regcred, but I'm still getting the ImagePullBackOff error.
Can anyone help me out here?
Regards and thanks
As per the GitLab documentation:
CI_REGISTRY_PASSWORD
The password to push containers to the project’s GitLab Container Registry. Only available if the Container Registry is enabled for the project. This password value is the same as the CI_JOB_TOKEN and is valid only as long as the job is running.
Use the CI_DEPLOY_PASSWORD for long-lived access to the registry.
So, based on this, your secret is no longer valid once your job finishes (which happens immediately after it creates your Kubernetes Deployment object). From then on, your Pod tries to pull the image from the registry using the now-invalid token/password (CI_REGISTRY_PASSWORD). That would also explain the events in your output: the first pull succeeded, and the later pulls failed with 401 Unauthorized.
Try creating and using a deploy token instead:
https://docs.gitlab.com/ee/user/project/deploy_tokens/index.html#gitlab-deploy-token
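As a rough sketch of what that could look like in the deploy job: assuming you create a deploy token named gitlab-deploy-token with the read_registry scope, GitLab exposes it to CI jobs as CI_DEPLOY_USER and CI_DEPLOY_PASSWORD. The function name create_pull_secret is made up for illustration; NAMESPACE and SECRET_NAME are the job variables from your gitlab-ci.yml.

```shell
#!/bin/sh
# Sketch: create or update the image pull secret from a GitLab deploy token.
# Assumption: a deploy token named "gitlab-deploy-token" (scope: read_registry)
# exists, so CI_DEPLOY_USER and CI_DEPLOY_PASSWORD are set in the job.
create_pull_secret() {
  # Abort with an error if any required variable is unset or empty.
  : "${NAMESPACE:?}" "${SECRET_NAME:?}" "${CI_REGISTRY:?}"
  : "${CI_DEPLOY_USER:?}" "${CI_DEPLOY_PASSWORD:?}"

  # Render the secret client-side, then apply it, so the step works
  # whether or not the secret already exists.
  kubectl create secret docker-registry "$SECRET_NAME" \
    --namespace "$NAMESPACE" \
    --docker-server="$CI_REGISTRY" \
    --docker-username="$CI_DEPLOY_USER" \
    --docker-password="$CI_DEPLOY_PASSWORD" \
    --dry-run=client -o yaml | kubectl apply --namespace "$NAMESPACE" -f -
}
```

Calling create_pull_secret in the script section would take the place of the delete/create pair that uses CI_REGISTRY_USER and CI_REGISTRY_PASSWORD. Unlike the job token, the deploy token stays valid after the job ends, so later pulls (for example after a container restart with imagePullPolicy: Always) should still be authorized.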