kubernetes, google-kubernetes-engine, kubectl

Automatic Down- and Upscaling of Kubernetes Cluster Depending on Request Frequency


I have a small Web Application running on a Google Kubernetes Cluster. But I want to save some money, because the web app does not get much traffic.

Thus my goal is to automatically downscale my Kubernetes cluster to 0 nodes if there has been no traffic for a certain amount of time. And of course it should automatically spin up a node again when there is incoming traffic.

Any ideas on how to do this?


Solution

  • The GKE autoscaler scales up only when there are pods waiting to be scheduled that do not fit on any current node, and adding a node would allow them to be scheduled.

    Scaling down occurs whenever a node is using less than half its total memory and CPU, and all the pods running on the node can be scheduled on another node.

    This being said, the autoscaler will never scale a cluster down to 0, as the requirements for that can't be met: the pods on the last remaining node have no other node to move to.

    However, you can configure Horizontal Pod Autoscaling for your application deployment. You can configure HPA to scale up or down based on the number of HTTP requests using a custom metric. Despite this, HPA should also not scale the deployment all the way down to 0, nor should it scale up from 0.

    If you configure HPA properly, enable cluster autoscaling, and plan how your pods are deployed by leveraging taints, tolerations, and affinity, then you can optimize autoscaling so that your cluster scales down to a minimal size. But it still will not be 0.

    All this being said, if you are running just a simple application with extended downtime, you may want to consider using Cloud Run or App Engine as those will be easier to manage than GKE and will have far less overhead (and likely less cost).
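
To make the cluster autoscaler conditions described above more concrete, here is a rough Python model of them. It is only a sketch of the two rules as stated in this answer, not the autoscaler's actual implementation: the `Pod` and `Node` classes, the function names, and the first-fit placement are all assumptions made for illustration, and the 0.5 threshold is simply the "less than half" from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pod:
    cpu: float     # CPU cores the pod needs (hypothetical unit)
    memory: float  # memory in GiB the pod needs (hypothetical unit)


@dataclass
class Node:
    cpu: float     # total CPU cores on the node
    memory: float  # total memory in GiB on the node
    pods: List[Pod] = field(default_factory=list)

    def free_cpu(self) -> float:
        return self.cpu - sum(p.cpu for p in self.pods)

    def free_memory(self) -> float:
        return self.memory - sum(p.memory for p in self.pods)


def needs_scale_up(pending: Pod, nodes: List[Node]) -> bool:
    """True when the pending pod does not fit on any current node."""
    return not any(
        pending.cpu <= n.free_cpu() and pending.memory <= n.free_memory()
        for n in nodes
    )


def can_scale_down(node: Node, others: List[Node], threshold: float = 0.5) -> bool:
    """True when the node is under the threshold for both CPU and memory
    and every pod on it can be placed on one of the other nodes."""
    cpu_used = sum(p.cpu for p in node.pods) / node.cpu
    mem_used = sum(p.memory for p in node.pods) / node.memory
    if cpu_used >= threshold or mem_used >= threshold:
        return False

    # Simple first-fit placement onto the remaining capacity of other nodes.
    free = [[n.free_cpu(), n.free_memory()] for n in others]
    for pod in node.pods:
        for slot in free:
            if pod.cpu <= slot[0] and pod.memory <= slot[1]:
                slot[0] -= pod.cpu
                slot[1] -= pod.memory
                break
        else:
            return False  # this pod has nowhere to go
    return True


node_a = Node(cpu=2.0, memory=4.0, pods=[Pod(0.25, 0.5)])
node_b = Node(cpu=2.0, memory=4.0, pods=[Pod(0.5, 1.0)])

print(can_scale_down(node_a, [node_b]))  # node_a is lightly used and its pod fits on node_b
print(can_scale_down(node_a, []))        # last node: its pod cannot move anywhere
```

In this model, the second call returns `False` because a node that still runs a pod can only be removed if another node can take that pod, which is the reason given above for why the cluster does not reach 0 nodes.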
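
For the HPA part, it may help to see how a replica count follows from a request-rate metric. The sketch below uses the formula given in the Kubernetes HPA documentation, `desired = ceil(current * currentMetric / targetMetric)`, clamped to the configured minimum and maximum. The function name, the parameter names, and the example numbers are assumptions for illustration; the 0.1 tolerance is the documented default, but treat it as a tunable here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_rps_per_pod: float,  # observed average requests/sec per pod (hypothetical metric)
    target_rps_per_pod: float,   # target average requests/sec per pod
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Replica count from the HPA formula, clamped to [min_replicas, max_replicas]."""
    ratio = current_rps_per_pod / target_rps_per_pod

    # Within the tolerance band around 1.0, leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(2, 150, 100))  # load 50% above target: scales 2 -> 3
print(desired_replicas(4, 20, 100))   # load far below target: scales 4 -> 1
print(desired_replicas(4, 0, 100))    # no traffic at all: still held at min_replicas
```

Note that the formula multiplies by the current replica count, so in this sketch a deployment at 0 replicas would stay at 0 no matter how high the ratio is. That is one way to see why scaling up from 0 is not something to expect from HPA alone.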