
Unable to pull new image with AKS and ACR


I'm suddenly having issues pulling the latest image from Azure Container Registry (ACR) with AKS, which previously worked fine.

If I run

kubectl describe pod <podid>

I get:


Failed to pull image <image>: rpc error: code = Unknown desc = Error response from daemon: Get <image>: unauthorized: authentication required

I've tried logging into the ACR manually and it's all working correctly - the new images have pushed correctly and I can pull them manually.

Moreover I've tried:

az aks update -g MyResourceGroup -n MyManagedCluster --attach-acr acrName

This succeeds (no errors, and the output reports that propagation was successful), but the image pull still fails.

I've tried updating the credentials with:

az aks update-credentials --resource-group <group> --name <aks name> --reset-service-principal --service-principal <sp id> --client-secret <client-secret>

Which spits out a rather weird message:

Deployment failed. Correlation ID: 6e84754a-821d-4a39-a0df-7ab9ba21973f.
Unable to get log analytics workspace info. Resource ID: /subscriptions/<subscription id>/resourcegroups/defaultresourcegroup-weu/providers/microsoft.operationalinsights/workspaces/defaultworkspace-d259e6ea-8230-4cb0-a7a8-7f0df6c7ef18-weu. Details: autorest/azure: Service returned an error. Status=404 Code="ResourceGroupNotFound" Message="Resource group 'defaultresourcegroup-weu' could not be found.". For more details about how to create and use log analytics workspace, please refer to: https://aka.ms/new-log-analytics

I tried creating a new log analytics workspace and the error above persisted.

I've also tried steps from:

This link

This SO post

As well as this post

Besides the posts above, I've gone through many tutorials and Microsoft pages to try to fix the problem.

I've tried creating a new service principal and assigning it the appropriate roles but the error still persists. I've also dabbled with creating new secrets and had no success.

My pods that don't need new images are all running as expected. If I look at my app registrations (under Azure Active Directory), they were all created a year ago, so I'm concerned that something has expired and I don't know how to fix it.
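One way to test the expiry suspicion is to list the end dates of the secrets on the cluster's service principal. The following is only a sketch under assumptions: the helper names `is_expired` and `sp_secret_expiries` are hypothetical, the cluster is assumed to use a service principal rather than a managed identity, and the `endDateTime` field name is assumed from recent Azure CLI versions (older versions may report `endDate` instead).

```shell
# Hypothetical helper: succeeds if the ISO-8601 UTC timestamp in $1 is in the past.
# Only the first 19 characters (YYYY-MM-DDTHH:MM:SS) are compared, as a plain number.
is_expired() (
  expiry="$(printf '%s' "$1" | cut -c1-19 | tr -cd '0-9')"
  now="$(date -u +%Y%m%d%H%M%S)"
  [ "$expiry" -lt "$now" ]
)

# Hypothetical helper: prints the end date of every secret on the cluster's
# service principal. $1 is the resource group, $2 the AKS cluster name.
# Assumes the Azure CLI is installed and logged in.
sp_secret_expiries() (
  group="$1"
  cluster="$2"
  # Client ID of the service principal the cluster runs as
  # (a managed-identity cluster would not return a usable ID here).
  sp_id="$(az aks show --resource-group "$group" --name "$cluster" \
    --query servicePrincipalProfile.clientId --output tsv)"
  # Field name assumed; adjust to endDate if the CLI version reports that instead.
  az ad sp credential list --id "$sp_id" --query "[].endDateTime" --output tsv
)
```

Calling `sp_secret_expiries <group> <aks name>` and passing each printed date to `is_expired` would show whether any secret has lapsed.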


Solution

  • Got this working by disabling the Log Analytics addon using:

    az aks disable-addons -a monitoring -n <AKSName> -g <ResourceGroupName>

    As one of the error messages I posted suggests, the Log Analytics addon seemed to be what was breaking things (I'm not sure why), so I disabled it for the time being and was then able to update the credentials with:

    az aks update-credentials --resource-group <group> --name <aks name> --reset-service-principal --service-principal <sp id> --client-secret <client-secret>
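
The two commands from this answer could be chained so that the credential reset only runs if disabling the addon succeeds. This is a minimal sketch, not a tested procedure: the function name `reset_aks_credentials` and its argument order are hypothetical, and it assumes the Azure CLI is installed and logged in.

```shell
# Hypothetical wrapper around the two commands from the answer.
# Arguments: $1 resource group, $2 AKS cluster name,
#            $3 service principal ID, $4 client secret.
reset_aks_credentials() (
  group="$1"
  cluster="$2"
  sp_id="$3"
  secret="$4"

  # Disable the monitoring (Log Analytics) addon first; stop if that fails.
  az aks disable-addons -a monitoring -n "$cluster" -g "$group" || exit 1

  # Then reset the service principal credentials on the cluster.
  az aks update-credentials --resource-group "$group" --name "$cluster" \
    --reset-service-principal --service-principal "$sp_id" --client-secret "$secret"
)
```

It would be called as `reset_aks_credentials <group> <aks name> <sp id> <client-secret>`.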