Hey all,
We've run into a problem with some pods going out of sync on our Kafka cluster after pod restarts during GCP automated node pool upgrades. I'm trying to find out whether the readiness probe only passes once Kafka is fully in sync and functional, or whether it reports readiness before that point.
The Strimzi Kafka image uses the following readiness probe. However, when I exec into the pod and cat this file, it's just empty?
exec:
  command:
  - test
  - -f
  - /var/opt/kafka/kafka-ready
failureThreshold: 3
initialDelaySeconds: 15
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
No, the readiness check in Strimzi does not check whether the whole cluster is in sync. It only checks whether the Kafka broker is ready to accept new connections. You can use the Strimzi Drain Cleaner utility to help handle voluntary evictions, such as node drains during upgrades, without breaking cluster availability. For more info, see https://strimzi.io/docs/operators/latest/full/deploying.html#rolling_pods_using_the_strimzi_drain_cleaner or https://github.com/strimzi/drain-cleaner
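
As for the file being empty: that is expected for this kind of probe. `test -f` only checks that the path exists and is a regular file. It never reads the contents, so the file presumably acts as a marker that gets created once the broker is considered ready. Here is a minimal Python sketch of that behaviour. The helper name `probe_passes` and the temp file standing in for /var/opt/kafka/kafka-ready are just for illustration, not anything from Strimzi itself, and it assumes a POSIX system where `test` is on the PATH.

```python
import os
import subprocess
import tempfile


def probe_passes(path: str) -> bool:
    """Mimic the probe: run `test -f <path>` and report whether it exited 0."""
    return subprocess.run(["test", "-f", path]).returncode == 0


if __name__ == "__main__":
    # Hypothetical stand-in for /var/opt/kafka/kafka-ready: an empty temp file.
    fd, marker = tempfile.mkstemp()
    os.close(fd)

    # The file is 0 bytes, yet the probe command still succeeds.
    print("size:", os.path.getsize(marker), "passes:", probe_passes(marker))

    # Once the marker is gone, the same command fails.
    os.remove(marker)
    print("passes after removal:", probe_passes(marker))
```

So an empty kafka-ready file does not mean anything is wrong. It just tells you the probe is an existence check and says nothing about replica sync.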