kubernetes rabbitmq kubernetes-helm bitnami

How to autoscale the RabbitMQ StatefulSet from the Bitnami Helm chart

I have installed RabbitMQ using the Bitnami Helm chart.

Its documentation talks about manual horizontal scaling, which I understand. I wonder about autoscaling, which could be really handy. Would it be possible and safe to set up autoscaling without data loss?

Solution

To scale the RabbitMQ cluster up or down, I recommend a declarative helm upgrade with a version-controlled values YAML file rather than the imperative kubectl scale command. The replica count is then tracked in source control, and if you later rerun helm upgrade after changing other values, it will still apply the correct replica count.
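As a rough sketch of what that could look like, the fragment below keeps the replica count in a values file. The release name my-release and the repo alias bitnami are assumptions for illustration; replicaCount is the chart parameter used for horizontal scaling.

```yaml
# values.yaml, kept in version control (minimal sketch)
# Applied with something like (release name and repo alias are assumptions):
#   helm upgrade my-release bitnami/rabbitmq -f values.yaml
replicaCount: 3
```

Changing the number in this file and rerunning the upgrade is then the only way the cluster size changes, so the file always reflects the intended size.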

I don't see a Horizontal Pod Autoscaler (https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) template in https://github.com/bitnami/charts/tree/master/bitnami/rabbitmq/templates, so I don't think the chart has autoscaling built in. If you add an HPA yourself, note that the v2beta2 API (since promoted to autoscaling/v2) has a behavior field with several options for stabilizing downscaling.
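If you wanted to experiment with an HPA of your own, a manifest along these lines might be a starting point. This is only a sketch, not something the chart provides: the object name, the StatefulSet name my-release-rabbitmq, the replica bounds, and the CPU target are all assumptions. CPU-based scaling also needs resource requests set on the pods.

```yaml
apiVersion: autoscaling/v2          # v2beta2 on older clusters
kind: HorizontalPodAutoscaler
metadata:
  name: rabbitmq-hpa                # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: my-release-rabbitmq       # assumed StatefulSet name
  minReplicas: 3
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleDown:
      # Wait 10 minutes of sustained low load before scaling down,
      # and remove at most one pod every 5 minutes.
      stabilizationWindowSeconds: 600
      policies:
        - type: Pods
          value: 1
          periodSeconds: 300
      # selectPolicy: Disabled      # alternative: never scale down automatically
```

Keep in mind that an HPA only changes the replica count. It does not run any RabbitMQ cleanup, and a later helm upgrade may reset the count to whatever replicaCount says.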

As mentioned in https://github.com/bitnami/charts/tree/master/bitnami/rabbitmq#horizontal-scaling, when you scale down a cluster, Helm/Kubernetes only removes the pod. It does not remove the RabbitMQ node from the cluster (which requires the rabbitmqctl forget_cluster_node command), and it does not delete the PVC associated with the pod. Because the PVC is kept and the chart uses a StatefulSet, scaling the cluster up again will reuse the same PVC. You have to run rabbitmqctl forget_cluster_node and delete the PVC manually, as described in that document.
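To make the manual cleanup concrete, here is a small Python sketch that only prints the commands you would run for each pod removed by a scale-down. The function name is hypothetical, and the naming patterns are assumptions to check against your cluster: a headless service called <statefulset>-headless, a volume claim template called data, a container called rabbitmq, and the cluster.local domain. Running rabbitmqctl cluster_status first will show the real node names.

```python
from typing import List


def scale_down_cleanup_commands(
    statefulset: str,
    current_replicas: int,
    target_replicas: int,
    namespace: str = "default",
    claim_template: str = "data",  # assumed volumeClaimTemplate name
) -> List[str]:
    """Return the manual cleanup commands for pods removed by a scale-down."""
    if not 0 < target_replicas <= current_replicas:
        raise ValueError("target_replicas must be between 1 and current_replicas")

    headless = f"{statefulset}-headless"  # assumed headless service name
    commands: List[str] = []
    # A StatefulSet removes the highest ordinals first, so those nodes are forgotten.
    for ordinal in range(current_replicas - 1, target_replicas - 1, -1):
        pod = f"{statefulset}-{ordinal}"
        node = f"rabbit@{pod}.{headless}.{namespace}.svc.cluster.local"
        # Run forget_cluster_node from pod 0, which survives the scale-down.
        commands.append(
            f"kubectl exec -n {namespace} {statefulset}-0 --container rabbitmq -- "
            f"rabbitmqctl forget_cluster_node {node}"
        )
        commands.append(f"kubectl delete pvc -n {namespace} {claim_template}-{pod}")
    return commands


if __name__ == "__main__":
    # Example: a release assumed to be named my-release, scaled from 3 to 2 replicas.
    for command in scale_down_cleanup_commands("my-release-rabbitmq", 3, 2):
        print(command)
```

The script deliberately does not execute anything, so you can review each command before running it by hand after the pod has actually been removed.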

Please note that, as per https://www.rabbitmq.com/clustering.html#cluster-membership, if you don't use a queue type that supports replication, you can lose message data whenever something goes wrong with the PV bound to a replica's PVC, or when a pod is evicted, crashes, is killed by the OOM killer, or the node it runs on fails. Scaling down the cluster can lead to message data loss in the same way. So the best way to avoid message data loss is to use a queue type that supports replication (such as quorum queues) and also to take Volume Snapshots: https://kubernetes.io/docs/concepts/storage/volume-snapshots/
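For the snapshot part, a VolumeSnapshot of one replica's PVC might look roughly like this. It assumes your storage is backed by a CSI driver with snapshot support; the snapshot name, the class name csi-snapclass, and the PVC name are placeholders for illustration.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: rabbitmq-data-0-snapshot            # hypothetical name
spec:
  volumeSnapshotClassName: csi-snapclass    # assumed VolumeSnapshotClass
  source:
    persistentVolumeClaimName: data-my-release-rabbitmq-0   # assumed PVC name
```

A snapshot is a point-in-time copy, so it complements replicated queues rather than replacing them.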

I also recommend reading the following to understand StatefulSet replicas:
https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/
https://kubernetes.io/docs/tasks/run-application/scale-stateful-set/