jenkins, azure-aks, acr

ImagePullBackOff with "rpc error: code = Unknown desc = failed to pull and unpack image" from AKS when pulling from ACR


When pulling a service-jenkins custom image from ACR, AKS gives the following error:

Warning Failed 0s (x2 over 31s) kubelet Failed to pull image "XXX.azurecr.io/service-jenkins:latest": [rpc error: code = Unknown desc = failed to pull and unpack image "XXX.azurecr.io/service-jenkins:latest": failed to extract layer sha256:XXX: unexpected EOF: unknown, rpc error: code = Unknown desc = failed to pull and unpack image "XXX.azurecr.io/service-jenkins:latest": failed to resolve reference "XXX.azurecr.io/service-jenkins:latest": failed to authorize: failed to fetch anonymous token: unexpected status: 401 Unauthorized]

We have taken the following steps in an attempt to resolve the issue:

  1. Connected AKS to ACR using a service principal (SP) instead of an image pull secret stored in the same namespace
  2. Uploaded a sample hello-world image to the same ACR, which AKS pulls successfully
  3. Verified that the image pull secret matches the ACR keys
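
For anyone repeating these checks, the AKS-to-ACR connection can also be validated from the Azure CLI. This is only a sketch: the function name `check_acr_access` and its three arguments (resource group, cluster name, registry name) are placeholders and do not come from the question.

```shell
# Sketch: ask Azure to validate that an ACR is accessible from an AKS cluster.
# The function name and all three arguments are placeholders.
check_acr_access() {
  resource_group="$1"
  cluster_name="$2"
  acr_name="$3"
  # az aks check-acr validates that the registry is reachable from the cluster
  az aks check-acr \
    --resource-group "$resource_group" \
    --name "$cluster_name" \
    --acr "$acr_name.azurecr.io"
}
```

Example call, with placeholder names: `check_acr_access myResourceGroup myAKSCluster XXX`. If this passes and a small image such as hello-world pulls fine, credentials are unlikely to be the root cause, which matches what is described above.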

We also pulled and ran the service-jenkins image with a local Docker engine to check whether the image build itself was at fault, but the container runs normally.

We are unable to pinpoint the exact issue. Any help is appreciated!


Solution

  • It turns out this specific issue occurs when both of the following are true:

    1. The AKS Kubernetes version is newer than 1.18.xx
    2. The image is built on the Ubuntu 20.10 Docker base image

    After digging into the issue, it seems that the Ubuntu 20.10 image has some layer duplication that doesn't work well with Microsoft's implementation of the Kubernetes containerd runtime.

    I'm no expert, but this is the only difference I noticed on Azure. We tried the same deployments on IBM Cloud, and they work as expected there.

    Simply upgrading the Ubuntu base image to 21.04 fixed the issue for me :)
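
    For illustration, the base image change can be scripted roughly like this. It is a sketch that assumes the Dockerfile pulls the base image with a plain `FROM ubuntu:20.10` line; the function name `bump_ubuntu_base` and the Dockerfile path are hypothetical.

```shell
# Sketch: switch a Dockerfile's base image from ubuntu:20.10 to ubuntu:21.04.
# The function name and the default path "Dockerfile" are hypothetical.
bump_ubuntu_base() {
  dockerfile="${1:-Dockerfile}"
  # Rewrite only FROM lines that reference ubuntu:20.10; a .bak copy is kept.
  sed -i.bak -E 's/^(FROM[[:space:]]+)ubuntu:20\.10/\1ubuntu:21.04/' "$dockerfile"
}
```

    After the change, rebuild the image, push it to ACR again, and redeploy so that AKS pulls the new layers.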