Error "the object has been modified" on k8s operator by golang



There are many discussions about this kind of error. The usual answer is that it happens because the object I am trying to update is an old version. I still have some questions, though. In my operator, in some scenarios I need to update a pod's annotations twice during a single 'Reconcile' call, and I often get the error "the object has been modified".

Question: where do 'r.Get()' and 'r.Update()' read and write the object: the local cache or the API server?

1: I thought that 'r.Get()' reads the object from the cache and 'r.Update()' writes the object to the cache. Is that right? If so, why do I get this error? And if the pod was changed by something other than the operator, would I be unable to update it for the rest of the current 'Reconcile', since the local cache is already out of sync with the API server? Why does retrying a few times get the latest object?

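import (
    corev1 "k8s.io/api/core/v1"
    apierrors "k8s.io/apimachinery/pkg/api/errors"
    ctrl "sigs.k8s.io/controller-runtime"
)

// Inside Reconcile:
var pod corev1.Pod
if err := r.Get(ctx, req.NamespacedName, &pod); err != nil {
    if apierrors.IsNotFound(err) {
        // The pod is gone, so there is nothing to update.
        return ctrl.Result{}, nil
    }
    log.Error(err, "unable to get pod")
    return ctrl.Result{}, err
}

// The annotation change goes here, before the update.
if err := r.Update(ctx, &pod); err != nil {
    log.Error(err, "unable to update chaosctl status")
    return ctrl.Result{}, err
}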

2: If 'r.Get()' reads from the API server and 'r.Update()' writes to the API server as well, why do I need to retry the update?


Solution

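With the default client that controller-runtime hands to a reconciler (the one returned by mgr.GetClient()), reads and writes take different paths.

r.Get():

For typed objects such as corev1.Pod, r.Get() reads from the local informer cache by default. The cache is filled and refreshed by a watch on the API server, so it can lag slightly behind the server. Get does not fall back to the API server on a cache miss: once the informer for that type has synced, an object that is not in the cache is reported as NotFound. To read directly from the API server, use the reader returned by mgr.GetAPIReader().

r.Update():

r.Update() sends the write straight to the API server; it does not write to the cache. The cache only sees the change later, when the corresponding watch event arrives. Every object carries a metadata.resourceVersion, and the API server rejects an update with a 409 Conflict ("the object has been modified") when the resourceVersion in the request is not the one currently stored. That means something (another controller, the kubelet updating the pod status, or your own earlier update) changed the object after the copy you are sending was read. On success, Update refreshes the object you passed in, including its resourceVersion.

This answers both questions. The error comes from the API server, not from the cache, and a stale copy does not stay stale forever: the cache usually catches up within moments, which is why a retry that re-reads the object tends to succeed. For two updates within one Reconcile, keep working on the same object after the first Update rather than calling r.Get() again, because the cache may not yet contain your first write.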
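
The version check can be illustrated with a small self-contained Go model. This is only a sketch and does not call the real client: apiServer, pod, copyPod and errConflict are made-up names, and resourceVersion is modeled as an integer even though Kubernetes treats it as an opaque string. It assumes that reads come from a cache snapshot, that writes go to the server, and that a successful update refreshes the caller's object in place.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the API server's 409 Conflict response.
var errConflict = errors.New("the object has been modified")

// pod is a toy object with only the fields needed to show the version check.
type pod struct {
	resourceVersion int
	annotations     map[string]string
}

// copyPod returns a deep copy so callers never share the annotations map.
func copyPod(p pod) pod {
	c := pod{resourceVersion: p.resourceVersion, annotations: map[string]string{}}
	for k, v := range p.annotations {
		c.annotations[k] = v
	}
	return c
}

// apiServer holds the authoritative copy of the object.
type apiServer struct{ stored pod }

// update rejects writes whose resourceVersion is stale. On success it bumps
// the stored version and refreshes the caller's object in place.
func (s *apiServer) update(p *pod) error {
	if p.resourceVersion != s.stored.resourceVersion {
		return errConflict
	}
	s.stored = copyPod(*p)
	s.stored.resourceVersion++
	p.resourceVersion = s.stored.resourceVersion
	return nil
}

func main() {
	server := &apiServer{stored: pod{resourceVersion: 1, annotations: map[string]string{}}}
	// Cache snapshot; in this model it never receives the watch event.
	cached := copyPod(server.stored)

	p := copyPod(cached) // first read from the cache
	p.annotations["step"] = "one"
	fmt.Println("first update:", server.update(&p))

	stale := copyPod(cached) // second read, cache still holds the old version
	stale.annotations["step"] = "two"
	fmt.Println("update from stale cache copy:", server.update(&stale))

	p.annotations["step"] = "two" // reuse the object refreshed by the first update
	fmt.Println("update reusing refreshed object:", server.update(&p))
}
```

In this model the first and third writes succeed and the second one is rejected, which mirrors what can happen when a pod is read from the cache again right after it was updated.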

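There are a few ways to handle it:

1. Optimistic concurrency control (OCC): this is the check the API server already performs through the object's metadata.resourceVersion. Always base an update on the most recent copy of the object.
2. Retry on conflict with retry.RetryOnConflict from k8s.io/client-go/util/retry, re-reading the object and re-applying the change on every attempt. Retrying r.Update() alone with the same stale object would just fail again:
    retryErr := retry.RetryOnConflict(retry.DefaultRetry, func() error {
        // Re-read the pod so the update carries the latest resourceVersion.
        if err := r.Get(ctx, req.NamespacedName, &pod); err != nil {
            return err
        }
        // Re-apply the change here, for example set the annotation.
        return r.Update(ctx, &pod)
    })
    if retryErr != nil {
        return ctrl.Result{}, retryErr
    }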
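
Why the re-read has to happen inside the retried function can be shown with a toy model that uses only the standard library. This is a hedged sketch, not client-go: retryOnConflict here is a simplified, hypothetical stand-in for retry.RetryOnConflict (no backoff), and apiServer, pod and errConflict are invented for the example, with resourceVersion modeled as an integer. The closure simulates another writer sneaking in during the first attempt.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the API server's 409 Conflict response.
var errConflict = errors.New("the object has been modified")

type pod struct {
	resourceVersion int
	annotations     map[string]string
}

// apiServer holds the authoritative copy of the object.
type apiServer struct{ stored pod }

// get returns a fresh deep copy of the stored object.
func (s *apiServer) get() pod {
	c := pod{resourceVersion: s.stored.resourceVersion, annotations: map[string]string{}}
	for k, v := range s.stored.annotations {
		c.annotations[k] = v
	}
	return c
}

// update accepts the write only if it is based on the current version.
func (s *apiServer) update(p pod) error {
	if p.resourceVersion != s.stored.resourceVersion {
		return errConflict
	}
	p.resourceVersion++
	s.stored = p
	return nil
}

// retryOnConflict calls fn up to maxAttempts times and retries only when
// fn reports a conflict. Any other result is returned immediately.
func retryOnConflict(maxAttempts int, fn func() error) error {
	var err error
	for i := 0; i < maxAttempts; i++ {
		if err = fn(); !errors.Is(err, errConflict) {
			return err
		}
	}
	return err
}

func main() {
	server := &apiServer{stored: pod{resourceVersion: 1, annotations: map[string]string{}}}
	attempts := 0
	err := retryOnConflict(5, func() error {
		attempts++
		p := server.get() // re-read on every attempt
		if attempts == 1 {
			// Simulate another writer (for example the kubelet) getting in first.
			other := server.get()
			other.annotations["owner"] = "kubelet"
			_ = server.update(other)
		}
		p.annotations["step"] = "done"
		return server.update(p)
	})
	fmt.Println("error:", err, "attempts:", attempts)
}
```

Here the first attempt conflicts and the second succeeds because it starts from a fresh copy. If the server.get() call were moved outside the closure, every attempt in this model would send the same stale version and all of them would conflict.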