Search code examples
kubernetesfluxcd

FluxCD on Azure AKS: Reconciler errors


I had to rerun flux bootstrap... on my cluster after a colleague accidentally ran flux bootstrap... on their new cluster using the existing branch and cluster from the same flux repo.

Running kubectl get gitrepositories -A has no errors -

flux-system flux-system ssh://[email protected]:7999/psmgsbb/flux.git stored artifact for revision 'master/252f6416c034bb67f06cc3e413e66704bc6b1069'

however I am seeing these errors now when I run flux logs --level=error

error ImagePolicy/post-processing-master-branch-policy.flux-system : Reconciler error cannot determine latest tag for policy version list argument cannot be empty

error HelmRelease/post-processing.post-processing-dev : Reconciler error previous release attempt remediation failed

error ImageRepository/post-processing-repository.flux-system : Reconciler error auth for "myacr.azurecr.io" not found in secret flux-system/psbombb-image-acr-auth-cc8mg5tk84

Regarding the secret above I ran:

kubcetl get secret -n flux-system psbombb-image-acr-auth-cc8mg5tk84 -oyaml

which gave me

apiVersion: v1 
data: 
.dockerconfigjson: 
   ewoJImRhdGEiOiAie1xuICBcI...<redacted>
 kind: Secret

which decodes to

"data": "{ 
    "auths": { 
      "myacr.azurecr.io": { 
         "auth": 
          "YTNlMTNlOGItYWQwNi00M2IzLTkyMjgtMjA0ZmQ2ODllMD<redacted>" 
       }
    }
 }"

So the ACR above myacr.azurecr.io does match the ACR in the secret. This error doesn't make sense to me?

Reconciler error auth for "myacr.azurecr.io" not found in secret flux-system/psbombb-image-acr-auth-cc8mg5tk84

So basically, do you know why reconcile fails now after a flux bootstrap?

Thank you


Solution

  • When flux bootstrap... was run accidentally on the cluster it upgraded kustomize to version 0.30.2. This was causing an issue with the formatting of the encrypted dockerconfigjson secret being written to Kubernetes.

    When the dockerconfigjson contents were base64 decoded there were line feeds everywhere which seems to have caused the reconciler error whereby it could not find the ACR reference -> myacr.azurecr.io

    I reverted the gotk-components.yaml kustomize-controller version back to the kustomize version prior to the accidental flux boostrap... i.e. from v03.2 to v022.3.

    Once the Kubernetes Secret was recreated with the correct dockerconfigjson format, reconciliation started working correctly.