kubernetes kubectl google-kubernetes-engine

Horizontal pod autoscaling in Kubernetes

I have a cluster that scales based on the CPU usage of my pods. The documentation states that i should prevent thrashing by scaling to fast. I want to play around with the autoscaling speed but i can't seem to find where to apply the following flags:

--horizontal-pod-autoscaler-downscale-delay
--horizontal-pod-autoscaler-upscale-delay

My goal is to set the cooldown timer lower then 5m or 3m, does anyone know how this is done or where I can find documentation on how to configure this? Also if this has to be configured in the hpa autoscaling YAML file, does anyone know what definition should be used for this or where I can find documentation on how to configure the YAML? This is a link to the Kubernetes documentation about scaling cooldowns i used.

Solution

The HPA controller is part of the controller manager and you'll need to pass the flags to it, see also the docs. It is not something you'd do via kubectl. It's part of the control plane (master) so depends on how you set up Kubernetes and/or which offering you're using. For example, in GKE the control plane is not accessible, in Minikube you'd ssh into the node, etc.