amazon-web-services kubernetes cicd argocd crossplane

Crossplane Provider Upgrade issue: Only one reference can have Controller set to true. Found "true" in references for Provider/x and Provider/x


We have an ArgoCD setup running in kind, where Crossplane is installed as an ArgoCD Application (example repository here). Crossplane Providers are also installed via an ArgoCD Application like this:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: provider-aws
  namespace: argocd
  labels:
    crossplane.jonashackt.io: crossplane
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/jonashackt/crossplane-argocd
    targetRevision: HEAD
    path: upbound/provider-aws/provider
  destination:
    namespace: default
    server: https://kubernetes.default.svc
  syncPolicy:
    automated:
      prune: true    
    retry:
      limit: 5
      backoff:
        duration: 5s 
        factor: 2 
        maxDuration: 1m

The Provider is defined like this in the Argo spec.source.path:

apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-aws-s3
spec:
  package: xpkg.upbound.io/upbound/provider-aws-ec2:v1.1.1
  packagePullPolicy: Always
  revisionActivationPolicy: Automatic
  revisionHistoryLimit: 1

When a new Crossplane provider version, provider-aws-ec2:v1.2.1, was released, we ran into the following issue: the Provider went into the Degraded state.

The events showed the following error:

cannot apply package revision: cannot create object: ProviderRevision.pkg.crossplane.io "provider-aws-ec2-150095bdd614" is invalid: metadata.ownerReferences: Invalid value: []v1.OwnerReference{v1.OwnerReference{APIVersion:"pkg.crossplane.io/v1", Kind:"Provider", Name:"provider-aws-ec2", UID:"30bda236-6c12-412c-a647-b96368eff8b6", Controller:(*bool)(0xc02afeb38c), BlockOwnerDeletion:(*bool)(0xc02afeb38d)}, v1.OwnerReference{APIVersion:"pkg.crossplane.io/v1", Kind:"Provider", Name:"provider-aws-ec2", UID:"ee890f53-7590-4957-8f81-e92b931c4e8d", Controller:(*bool)(0xc02afeb38e), BlockOwnerDeletion:(*bool)(0xc02afeb38f)}}: Only one reference can have Controller set to true. Found "true" in references for Provider/provider-aws-ec2 and Provider/provider-aws-ec2

Running kubectl get providerrevisions, we saw that the new Provider revision had already been installed (without us doing anything) and that the 'old' one was no longer HEALTHY:

kubectl get providerrevisions
NAME                                       HEALTHY   REVISION   IMAGE                                                STATE      DEP-FOUND   DEP-INSTALLED   AGE
provider-aws-ec2-3d66ea2d7903              Unknown   1          xpkg.upbound.io/upbound/provider-aws-ec2:v1.1.1      Active     1           1               5m31s
provider-aws-ec2-3d66ea2d7903              Unknown   1          xpkg.upbound.io/upbound/provider-aws-ec2:v1.2.1      Active     1           1               5m31s
upbound-provider-family-aws-7cc64a779806   True      1          xpkg.upbound.io/upbound/provider-family-aws:v1.2.1   Active                                 30m

What can we do to prevent Provider upgrades from breaking our setup?


Solution

  • As ArgoCD is doing the GitOps part in this setup, we need to let it take the lead in applying changes, and those changes have to be made in Git. With the current Provider setup, Crossplane automatically upgrades Providers without ArgoCD knowing anything about it, while ArgoCD keeps trying to reconcile the cluster back to what is stated in Git. The two mechanisms therefore end up in an ongoing 'fight'.

    To put ArgoCD in the lead of Provider upgrades through Git commits, we should set packagePullPolicy to IfNotPresent instead of Always. As the docs state, Always means "Check for new packages every minute and download any matching package that isn’t in the cache":

    apiVersion: pkg.crossplane.io/v1
    kind: Provider
    metadata:
      name: provider-aws-s3
    spec:
      package: xpkg.upbound.io/upbound/provider-aws-ec2:v1.1.1
      packagePullPolicy: IfNotPresent
      revisionActivationPolicy: Automatic
      revisionHistoryLimit: 1
    

    Interestingly, though, we need to leave revisionActivationPolicy set to Automatic! Otherwise, the Provider will never become active and healthy. I found the docs aren't that clear on this point.

    TL;DR: with packagePullPolicy: IfNotPresent, Crossplane will not automatically pull new Provider versions. Only a Git commit with a Provider version change will trigger the download, and also the upgrade through revisionActivationPolicy: Automatic.

    Remember to be a bit patient while the upgrade runs through: it can take up to a few minutes, depending on what the Provider being upgraded is doing at that moment (we did not wait long enough at first and thus thought this configuration was wrong, but it is not).
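
    To illustrate what such a Git-driven upgrade could look like, here is a minimal sketch of the manifest after a follow-up commit. It assumes the same file under upbound/provider-aws/provider and uses the v1.2.1 tag mentioned in the question. Only the package tag changes; the comments are illustrative and not part of the original repository.

    ```yaml
    # Sketch of the Provider manifest after a hypothetical upgrade commit.
    apiVersion: pkg.crossplane.io/v1
    kind: Provider
    metadata:
      name: provider-aws-s3
    spec:
      # Tag bumped from v1.1.1 to v1.2.1; this commit is the upgrade trigger.
      package: xpkg.upbound.io/upbound/provider-aws-ec2:v1.2.1
      # Crossplane no longer polls the registry for new packages on its own.
      packagePullPolicy: IfNotPresent
      # The new revision still has to be activated automatically.
      revisionActivationPolicy: Automatic
      revisionHistoryLimit: 1
    ```

    Once ArgoCD has synced the commit, kubectl get providerrevisions should, after a few minutes, list a revision for the v1.2.1 image.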